Docker Enterprise is a standards-based container platform for development and delivery of modern applications. Docker Enterprise is designed for application developers and IT teams who build, share, and run business-critical applications at scale in production. Docker Enterprise provides a consistent and secure end-to-end application pipeline, choice of tools and languages, and globally consistent Kubernetes environments that run in any cloud.
Docker Enterprise enables deploying highly available workloads using either the Docker Kubernetes Service or Docker Swarm. You can join thousands of physical or virtual machines together to create a cluster, allowing you to deploy your applications at scale and to manage your clusters from a centralized place.
Docker Enterprise automates many of the tasks that orchestration requires, like provisioning pods, containers, and cluster resources. Self-healing components ensure that Docker Enterprise clusters remain highly available.
The Docker Kubernetes Service fully supports all Docker Enterprise features, including role-based access control, LDAP/AD integration, image scanning and signing enforcement policies, and security policies.
Docker Kubernetes Service features include:
In addition, UCP integrates with Kubernetes by using admission controllers, which enable:
- Applying NodeSelector automatically to workloads via admission control
- Enforcing PodSecurityPolicy via an admission controller

The default Docker Enterprise installation includes both Kubernetes and Swarm components across the cluster, so every newly joined worker node is ready to schedule Kubernetes or Swarm workloads.
Docker Enterprise exposes the standard Kubernetes API, so you can use kubectl to manage your Kubernetes workloads:
kubectl cluster-info
Which produces output similar to the following:
Kubernetes master is running at https://54.200.115.43:6443
KubeDNS is running at https://54.200.115.43:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Docker Enterprise has its own built-in authentication mechanism with role-based access control (RBAC), so that you can control who can access and make changes to your cluster and applications. Docker Enterprise authentication also integrates with LDAP services and supports SAML and SCIM to proactively synchronize with authentication providers. You can also opt to enable PKI authentication to use client certificates, rather than a username and password.
Docker Enterprise integrates with Docker Trusted Registry so that you can keep the Docker images you use for your applications behind your firewall, where they are safe and can’t be tampered with. You can also enforce security policies and only allow running applications that use Docker images you know and trust.
Windows applications typically require Active Directory authentication in order to communicate with other services on the network. Container-based applications use Group Managed Service Accounts (gMSA) to provide this authentication. Docker Swarm fully supports the use of gMSAs with Windows containers.
Docker Enterprise exposes the standard Docker API, so you can continue using the tools that you already know, including the Docker CLI client, to deploy and manage your applications.
For example, you can use the docker info command to check the status of a Swarm managed by Docker Enterprise:
docker info
Which produces output similar to the following:
Containers: 38
Running: 23
Paused: 0
Stopped: 15
Images: 17
Server Version: 17.06
...
Swarm: active
NodeID: ocpv7el0uz8g9q7dmw8ay4yps
Is Manager: true
ClusterID: tylpv1kxjtgoik2jnrg8pvkg6
Managers: 1
...
This document provides instructions and best practices for Docker Enterprise backup procedures for all components of the platform.
To back up Docker Enterprise, you must create individual backups for each of the following components:
If you do not create backups for all components, you cannot restore your deployment to its previous state.
Test each backup you create. One way to test your backups is to do a fresh installation on a separate infrastructure with the backup. Refer to Restore Docker Enterprise for additional information.
Note: Application data backup is not included in this information. Persistent storage data backup is the responsibility of the storage provider for the storage plugin or driver.
You should only restore Docker Enterprise from a backup as a last resort. If you’re running Docker Enterprise in high-availability mode, you can remove unhealthy nodes from the swarm and join new ones to bring the swarm back to a healthy state.
To restore Docker Enterprise, restore components individually and in the following order:
In many organizations, authenticating to systems with a username and password combination is either restricted or outright prohibited. With Docker Enterprise 3.0, UCP’s CLI client certificate-based authentication has been extended to the web user interface (web UI). DTR has also been enhanced to work with UCP’s internally generated client bundles for client certificate-based authentication. If you have an external public key infrastructure (PKI) system, you can manage user authentication using a pool of X.509 client certificates in lieu of usernames and passwords.
The following table outlines existing and added capabilities when using client certificates — both internal to UCP and issued by an external certificate authority (CA) — for authentication.
Operation | Benefit |
---|---|
UCP browser authentication | Previously, UCP client bundles enabled communication between a local Docker client and UCP without the need of a username and password. Importing your client certificates into the browser extends this capability to the UCP web UI. |
DTR browser authentication | You can bypass the login page for the DTR web UI when you use TLS client certificates as a DTR authentication method. |
Image pulls and pushes to DTR | You can update Docker Engine with a client certificate for image pulls and pushes to DTR without the need for docker login. |
Image signing | You can use client certificates to sign images that you push to DTR. Depending on which client you configure to talk to DTR, the certificate files need to be located in certain directories. Alternatively, you can enable system-wide trust of your custom root certificates. |
DTR API access | You can use TLS client certificates in lieu of your user credentials to access the DTR API. |
Notary CLI operations with DTR | You can set your DTR as the remote trust server location and pass the certificate flags directly to the Notary CLI to access your DTR repositories. |

Note that a client certificate configured in Docker Engine applies to docker push and docker pull operations for all users of the same machine.

The following instructions apply to UCP and DTR administrators. For non-admin users, contact your administrator for details on your PKI’s client certificate configuration.
To bypass the browser login pages and hide the logout buttons for both UCP and DTR, follow the steps below.
Add your organization’s root CA certificates via the UCP web UI or the CLI installation command. For testing purposes, you can download an admin client bundle from UCP and convert the client certificates to a PKCS12 file (see "Convert your client certificates to a PKCS12 file" below).
Download UCP’s ca.pem from https://<ucp-url>/ca, either in the browser or via curl. When using curl, redirect the response output to a file:

curl -sk https://<ucp-url>/ca -o ca.pem
Enable client certificate authentication for DTR. If previously installed, reconfigure DTR with your UCP hostname’s root CA certificate. This will be your organization’s root certificate(s) appended to UCP’s internal root CA certificates.
docker run --rm -it docker/dtr:2.7.0 reconfigure --debug \
  --ucp-url <ucp-url> \
  --ucp-username <ucp_admin_user> \
  --ucp-password <ucp_admin_password> \
  --enable-client-cert-auth \
  --client-cert-auth-ca "$(cat ca.pem)"
See DTR installation and DTR reconfiguration CLI reference pages for an explanation of the different options.
Import the PKCS12 file into the browser or Keychain Access if you’re running macOS.
From the command line, switch to the directory of your client bundle and run the following command to convert the client bundle public and private key pair to a .p12 file:

openssl pkcs12 -export -out cert.p12 -inkey key.pem -in cert.pem

Choose a simple export password when prompted; you will need it again when you import the certificate into the browser or macOS Keychain Access.
Instructions on how to import a certificate into a web browser vary according to your platform, OS, preferred browser, and browser version. As a general rule, refer to one of the following how-to articles:

- Firefox: https://www.sslsupportdesk.com/how-to-import-a-certificate-into-firefox/
- Chrome: https://www.comodo.com/support/products/authentication_certs/setup/win_chrome.php
- Internet Explorer: https://www.comodo.com/support/products/authentication_certs/setup/ie7.php
To pull and push images to your DTR (with the client certificate authentication method enabled) without performing a docker login, do the following:
Create a directory for your DTR public address or FQDN (Fully Qualified Domain Name) within your operating system’s TLS certificate directory.
As a superuser, copy the private key (client.pem) and certificate (client.cert) to the machine you are using for pulling and pushing to DTR without doing a docker login. Note that the filenames must match.
Obtain the CA certificate from your DTR server, ca.crt, from https://<dtrurl>/ca, and copy ca.crt to your operating system’s TLS certificate directory so that your machine’s Docker Engine will trust DTR. For Linux, this is /etc/docker/certs.d/<dtrurl>/. On Docker for Mac, this is /<home_directory>/certs.d/<dtr_fqdn>/.
This is a convenient alternative to, for Ubuntu as an example, adding the DTR server certificate to /etc/ca-certs and running update-ca-certificates.

curl -k https://<dtr>/ca -o ca.crt

On Ubuntu:

cp ca.crt /etc/ca-certs
Restart the Docker daemon for the changes to take effect. See Configure your host for different ways to restart the Docker daemon.
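Putting the pieces together, a minimal sketch for Linux, where <dtr-fqdn> stands in for your DTR public address or FQDN and the client bundle files are assumed to be in the current directory:

sudo mkdir -p /etc/docker/certs.d/<dtr-fqdn>
sudo cp client.pem client.cert /etc/docker/certs.d/<dtr-fqdn>/
curl -k https://<dtr-fqdn>/ca -o ca.crt
sudo cp ca.crt /etc/docker/certs.d/<dtr-fqdn>/
sudo systemctl restart docker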
You have the option to add your DTR server CA certificate to your system’s trusted root certificate pool. This is the macOS Keychain or /etc/ca-certificates/ on Ubuntu. Note that you will have to remove the certificate if your DTR public address changes.
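A sketch for Ubuntu using the standard ca-certificates tooling (the target filename is arbitrary; the directory may differ on your distribution):

sudo cp ca.crt /usr/local/share/ca-certificates/<dtr-fqdn>.crt
sudo update-ca-certificates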
DTR provides the Notary service for using Docker Content Trust (DCT) out of the box.
Implementation | Component Pairing | Settings |
---|---|---|
Sign with docker trust sign | | Copy ca.crt from https://<dtr-external-url>/ca to the location required by your client (see the following sections). |
Enforce signature or hash verification on the Docker client | | export DOCKER_CONTENT_TRUST=1 to enable content trust on the Docker client. Copy ca.crt from https://<dtr-external-url>/ca to /<home_directory>/.docker/tls/ on Linux and macOS. docker push will sign your images. |
Sign images that UCP can trust | | Configure UCP to run only signed images. See Sign an image for detailed steps. |
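For example, with the CA certificate in place, signing with Docker Content Trust looks like the following (repository names are placeholders):

export DOCKER_CONTENT_TRUST=1              # pushes from this shell are signed
docker push <dtr-external-url>/<namespace>/<repo>:<tag>

# or sign an existing tag explicitly:
docker trust sign <dtr-external-url>/<namespace>/<repo>:<tag>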
With curl, you can interact with the DTR API by passing a public certificate and private key pair instead of your DTR username and password/authentication token.
curl --cert cert.pem --key key.pem -X GET \
"https://<dtr-external-url>/api/v0/repositories?pageSize=10&count=false" \
-H "accept:application/json"
In the above example, cert.pem contains the public certificate and key.pem contains the private key. For non-admin users, you can generate a client bundle from UCP or contact your administrator for your public and private key pair.
For Mac-specific quirks, see curl on certain macOS versions.
For establishing mutual trust between the Notary client and your trusted registry (DTR) using the Notary CLI, place your TLS client certificates in <home_directory>/.docker/tls/<dtr-external-url>/ as client.cert and client.key. Note that the filenames must match.

Pass the FQDN or publicly accessible IP address of your registry along with the TLS client certificate options to the Notary client. To get started, see Use the Notary client for advanced users.
Self-signed DTR server certificate

Also place ca.crt in <home_directory>/.docker/tls/<dtr-external-url>/ when you’re using a self-signed server certificate for DTR.
Hit your DTR’s basic_info endpoint via curl:
curl --cert cert.pem --key key.pem -X GET "https://<dtr-external-url>/basic_info"
If successfully configured, you should see TLSClientCertificate listed as the AuthnMethod in the JSON response.
{
"CurrentVersion": "2.7.0",
"User": {
"name": "admin",
"id": "30f53dd2-763b-430d-bafb-dfa361279b9c",
"fullName": "",
"isOrg": false,
"isAdmin": true,
"isActive": true,
"isImported": false
},
"IsAdmin": true,
"AuthnMethod": "TLSClientCertificate"
}
Avoid adding DTR to Docker Engine’s list of insecure registries as a workaround. This has the side effect of disabling the use of TLS certificates.
Error response from daemon: Get https://35.165.223.150/v2/: x509: certificate is valid for 172.17.0.1, not 35.165.223.150
If you see an error like the one above, reconfigure DTR with the --dtr-external-url option and the associated PEM files for your certificate.

For a chain of trust which includes intermediate certificates, you may optionally add those certificates when installing or reconfiguring DTR with --enable-client-cert-auth and --client-cert-auth-ca. You can do so by combining all of the certificates into a single PEM file.
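A sketch using hypothetical file names for your intermediate and root CA certificates:

cat intermediate-ca.pem root-ca.pem > client-auth-chain.pem
# then pass the combined file via --client-cert-auth-ca "$(cat client-auth-chain.pem)"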
Some versions of macOS include a curl build which only accepts .p12 files and specifically requires a ./ prefix in front of the file name if running curl from the same directory as the .p12 file:
curl --cert ./client.p12 -X GET \
"https://<dtr-external-url>/api/v0/repositories?pageSize=10&count=false" \
-H "accept:application/json"
Docker Engine - Enterprise version 17.06 and later includes a telemetry plugin. The plugin is enabled by default on Ubuntu starting with Docker Engine - Enterprise 17.06.0 and on the rest of the Docker Engine - Enterprise supported Linux distributions starting with version 17.06.2-ee-5. The telemetry plugin is not part of Docker Engine - Enterprise for Windows Server.
The telemetry plugin sends system information to Docker Inc. Docker uses this information to improve Docker Engine - Enterprise. For details about the telemetry plugin and the types of data it collects, see the telemetry plugin documentation.
If your Docker instance runs in an environment with no internet connectivity, the telemetry plugin does not collect or attempt to send any information to Docker Inc.
If you don’t wish to send any usage data to Docker Inc., you can disable the plugin, either using the Docker CLI or using Universal Control Plane.
Warning
If you’re using Docker Engine - Enterprise with Universal Control Plane (UCP), use UCP to enable and disable metrics. Use the CLI only if you don’t have UCP. UCP re-enables the telemetry plugin for hosts where it was disabled with the CLI.
If you use Universal Control Plane with Docker Engine - Enterprise, do not use the Docker CLI to disable the telemetry plugin. Instead, you can manage the information sent to Docker by going to Admin Settings and choosing Usage.
To disable the telemetry plugin, disable all three options and click Save. Enabling either or both of the top two options will enable the telemetry plugin. You can find out more about an individual option by clicking the ? icon.
Important
If API usage statistics are enabled, Docker gathers only aggregate stats about what API endpoints are used. API payload contents aren’t collected.
At the engine level, there is a telemetry module built into Docker Enterprise Engine 18.09 and newer. It can be disabled by modifying the daemon configuration file, which by default is stored at /etc/docker/daemon.json.
{
"features": {
"telemetry": false
}
}
For the Docker daemon to pick up the changes in the configuration file, the Docker daemon will need to be restarted.
$ sudo systemctl restart docker
To re-enable the telemetry module, change the value to "telemetry": true or completely remove the "telemetry": false line, as the default value is true.
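For example, the re-enabled configuration would simply read:

{
  "features": {
    "telemetry": true
  }
}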
For Docker Enterprise Engine 18.03 or older, the telemetry module ran as a Docker plugin. To disable the telemetry plugin, use the docker plugin disable command with either the plugin NAME or ID:
$ docker plugin ls
ID NAME [..]
114dbeaa400c docker/telemetry:1.0.0.linux-x86_64-stable [..]
$ docker plugin disable docker/telemetry:1.0.0.linux-x86_64-stable
This command must be run on each Docker host.
To re-enable the telemetry plugin, you can use docker plugin enable
with either the plugin NAME or ID:
$ docker plugin ls
ID NAME [..]
114dbeaa400c docker/telemetry:1.0.0.linux-x86_64-stable [..]
$ docker plugin enable docker/telemetry:1.0.0.linux-x86_64-stable
To upgrade Docker Enterprise, you must individually upgrade each of the following components:
Because some components become temporarily unavailable during an upgrade, schedule upgrades to occur outside of peak business hours to minimize impact to your business.
Docker Engine - Enterprise upgrades in Swarm clusters should follow these guidelines in order to avoid IP address space exhaustion and associated application downtime.
Before upgrading Docker Engine - Enterprise, you should make sure you create a backup. This makes it possible to recover if anything goes wrong during the upgrade.
You should also check the compatibility matrix, to make sure all Docker Engine - Enterprise components are certified to work with one another. You may also want to check the Docker Engine - Enterprise maintenance lifecycle, to understand until when your version may be supported.
Before you upgrade, make sure:
Certificates
Externally signed certificates are used by the Kubernetes API server and the UCP controller.
In Swarm overlay networks, each task connected to a network consumes an IP address on that network. Swarm networks have a finite number of IP addresses based on the --subnet configured when the network is created. If no subnet is specified, Swarm defaults to a /24 network with 254 available IP addresses. When the IP space of a network is fully consumed, Swarm tasks can no longer be scheduled on that network.
Starting with Docker Engine - Enterprise 18.09, each Swarm node consumes an IP address from every Swarm network. This IP address is used by the Swarm internal load balancer on the network. Swarm networks running on Engine versions 18.09 or greater must be configured to account for this increase in IP usage. Networks that are at or near full IP utilization before the upgrade risk exhausting their address space afterwards, which prevents new tasks from being scheduled on that network.
Maximum IP consumption per network at any given moment follows the following formula:
Max IP Consumed per Network = Number of Tasks on a Swarm Network + 1 IP for each node where these tasks are scheduled
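For example (an illustrative calculation, not from the source): a service running 200 tasks spread across 10 nodes consumes 200 + 10 = 210 addresses, which still fits in a default /24 network (254 available addresses); scaling the same service to 250 tasks would require 260 addresses and exhaust that network.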
To prevent this from happening, overlay networks should have enough spare capacity prior to an upgrade to 18.09 so that they still have free addresses after the upgrade. The instructions below offer tooling and steps to ensure capacity is measured before performing an upgrade.

The above only applies to containers running on Swarm overlay networks. It does not impact bridge, macvlan, host, or third-party Docker networks.
To avoid application downtime, you should be running Docker Engine - Enterprise in Swarm mode and deploying your workloads as Docker services. That way you can drain the nodes of any workloads before starting the upgrade.
If you have workloads running as containers as opposed to swarm services, make sure they are configured with a restart policy. This ensures that your containers are started automatically after the upgrade.
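A restart policy can be set when the container is created. A minimal sketch (the image name is a placeholder):

docker run -d --restart unless-stopped <image>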
To ensure that workloads running as Swarm services have no downtime, you need to:
If you do this sequentially for every node, you can upgrade with no application downtime. When upgrading manager nodes, make sure the upgrade of a node finishes before you start upgrading the next node. Upgrading multiple manager nodes at the same time can lead to a loss of quorum, and possible data loss.
Starting with a cluster with one or more services configured, determine whether some networks may require updating the IP address space in order to function correctly after a Docker Engine - Enterprise 18.09 upgrade.
$ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock docker/ip-util-check
If the network is in danger of exhaustion, the output will show similar warnings or errors:
Overlay IP Utilization Report
----
Network ex_net1/XXXXXXXXXXXX has an IP address capacity of 29 and uses 28 addresses
ERROR: network will be over capacity if upgrading Docker engine version 18.09
or later.
----
Network ex_net2/YYYYYYYYYYYY has an IP address capacity of 29 and uses 24 addresses
WARNING: network could exhaust IP addresses if the cluster scales to 5 or more nodes
----
Network ex_net3/ZZZZZZZZZZZZ has an IP address capacity of 61 and uses 52 addresses
WARNING: network could exhaust IP addresses if the cluster scales to 9 or more nodes
If a network is exhausted, you can triage it using the following steps.

Check the docker service ls output. It will display the service that is unable to completely fill all its replicas, such as:

ID             NAME        MODE        REPLICAS  IMAGE         PORTS
wn3x4lu9cnln   ex_service  replicated  19/24     nginx:latest

Run docker service ps ex_service to find a failed replica, such as:

ID            NAME             IMAGE         NODE             DESIRED STATE  CURRENT STATE           ERROR                              PORTS
...
i64lee19ia6s  \_ ex_service.11  nginx:latest  tk1706-ubuntu-1  Shutdown       Rejected 7 minutes ago  "node is missing network attac…"
...

Inspect the failed replica with docker inspect. In this example, the docker inspect i64lee19ia6s output shows the error in the Status.Err field:

...
"Status": {
"Timestamp": "2018-08-24T21:03:37.885405884Z",
"State": "rejected",
"Message": "preparing",
**"Err": "node is missing network attachments, ip addresses may be exhausted",**
"ContainerStatus": {
"ContainerID": "",
"PID": 0,
"ExitCode": 0
},
"PortStatus": {}
},
...
The following is a constraint introduced by architectural changes to the Swarm overlay networking when upgrading to Docker Engine - Enterprise 18.09 or later. It only applies to this one-time upgrade and to workloads that are using the Swarm overlay driver. Once upgraded to Docker Engine - Enterprise 18.09, this constraint does not impact future upgrades.
When upgrading to Docker Engine - Enterprise 18.09, manager nodes cannot reschedule new workloads on the managers until all managers have been upgraded to the Docker Engine - Enterprise 18.09 (or higher) version. During the upgrade of the managers, there is a possibility that any new workloads that are scheduled on the managers will fail to schedule until all of the managers have been upgraded.
In order to avoid any impactful application downtime, it is advised to reschedule any critical workloads on to Swarm worker nodes during the upgrade of managers. Worker nodes and their network functionality will continue to operate independently during any upgrades or outages on the managers. Note that this restriction only applies to managers and not worker nodes.
If you are running live applications on the cluster while upgrading, remove them from the nodes being upgraded so as not to create unplanned outages.
Start by draining the node so that services get scheduled in another node and continue running without downtime.
For that, run this command on a manager node:
$ docker node update --availability drain <node>
To upgrade a node individually by operating system, please follow the instructions listed below:
After all manager and worker nodes have been upgraded, the Swarm cluster can be used again to schedule new workloads. If workloads were previously scheduled off of the managers, they can be rescheduled on them. If any worker nodes were drained, they can be returned to service by setting --availability active.
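For example, to return a previously drained node to service, run the following on a manager node:

$ docker node update --availability active <node>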
Docker Engine - Enterprise is a client-server application with these major components:
Docker Engine - Enterprise can be installed on several Linux distros as well as on Windows.
This document describes the latest changes, additions, known issues, and fixes for Docker Engine - Enterprise (Docker EE).
Docker EE is a superset of all the features in Docker CE. It incorporates defect fixes that you can use in environments where new features cannot be adopted as quickly for consistency and compatibility reasons.
Note
New in 18.09 is an aligned release model for Docker Engine - Enterprise. The new versioning scheme is YY.MM.x where x is an incrementing patch version. The enterprise engine is a superset of the community engine. They will ship concurrently with the same x patch version based on the same code base.
Note
The client and container runtime are now in separate packages from the daemon in Docker Engine 18.09. Users should install and update all three packages at the same time to get the latest patch releases. For example, on Ubuntu: sudo apt-get install docker-ee docker-ee-cli containerd.io. See the install instructions for the corresponding Linux distro for details.
(2020-11-12)
(2020-08-10)
(2020-06-24)
2019-11-14

- --default-addr-pool for docker swarm init not picked up by ingress network. docker/swarmkit#2892

2019-10-08

- docker rmi stuck in case of misconfigured system (such as dead NFS share). docker/engine#336
- max-size is set and max-file=1. docker/engine#377

2019-09-03

- --config-only networks and --config-from networks have ungracefully exited. docker/libnetwork#2373

2019-07-17

- docker stack deploy is used to redeploy a stack which includes non-external secrets, the logs will contain the secret.
- parallelism and max_failure_ratio fields.

2019-06-27

- --network-rm would fail to remove a network. moby/moby#39174
- docker service create --limit-cpu. moby/moby#39190

2019-05-06

- COPY and ADD with multiple <src> to not invalidate cache if DOCKER_BUILDKIT=1. moby/moby#38964

2019-04-11

- DOCKER_BUILDKIT=1 docker build --squash .. docker/engine#176
- network=host using wrong resolv.conf with systemd-resolved. docker/engine#180
- Restarting. docker/engine#181

2019-03-28

- git ref to avoid misinterpretation as a flag. moby/moby#38944
- docker cp error for filenames greater than 100 characters. moby/moby#38634
- layer/layer_store to ensure NewInputTarStream resources are released. moby/moby#38413
- GetConfigs. moby/moby#38800
- containerd 1.2.5. docker/engine#173

2019-02-28

2019-02-11

- Update runc to address a critical vulnerability that allows specially-crafted containers to gain administrative privileges on the host. CVE-2019-5736

For additional information, refer to the Docker blog post.
2019-01-09
In Docker versions prior to 18.09, containerd was managed by the Docker
engine daemon. In Docker Engine 18.09, containerd is managed by systemd.
Since containerd is managed by systemd, any custom configuration in the docker.service systemd configuration which changes mount settings (for example, MountFlags=slave) breaks interactions between the Docker Engine daemon and containerd, and you will not be able to start containers.

Run the following command to get the current value of the MountFlags property for the docker.service:
sudo systemctl show --property=MountFlags docker.service
MountFlags=
Update your configuration if this command prints a non-empty value for MountFlags, and restart the docker service.
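A sketch of the remediation, assuming the MountFlags setting lives in a drop-in file you control (the exact file is environment-specific): remove or comment out the MountFlags= line in that drop-in, then reload systemd and restart Docker:

sudo systemctl daemon-reload
sudo systemctl restart docker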
- /proc/asound to masked paths. docker/engine#126
- containerd. docker/engine#122
- service update --force. docker/cli#1526
- docker kill. docker/engine#116
- containerd is not upgraded to the correct version on Ubuntu. Learn more.

2018-11-08
In Docker versions prior to 18.09, containerd was managed by the Docker
engine daemon. In Docker Engine 18.09, containerd is managed by systemd.
Since containerd is managed by systemd, any custom configuration in the docker.service systemd configuration which changes mount settings (for example, MountFlags=slave) breaks interactions between the Docker Engine daemon and containerd, and you will not be able to start containers.

Run the following command to get the current value of the MountFlags property for the docker.service:
sudo systemctl show --property=MountFlags docker.service
MountFlags=
Update your configuration if this command prints a non-empty value for MountFlags, and restart the docker service.
- /info endpoint, and move detection to the daemon. moby/moby#37502
- --secret flag when using BuildKit. docker/cli#1288
- docker build --ssh $SSHMOUNTID=$SSH_AUTH_SOCK when using BuildKit. docker/cli#1438 / docker/cli#1419
- --chown flag support for ADD and COPY commands on Windows. moby/moby#35521
- builder prune subcommand to prune BuildKit build cache. docker/cli#1295 / docker/cli#1334
- docker build --pull ... when using BuildKit. moby/moby#37613
- docker engine subcommand to manage the lifecycle of a Docker Engine running as a privileged container on top of containerd, and to allow upgrades to Docker Engine Enterprise. docker/cli#1260
- docker info output. docker/cli#1313
- docker info output. docker/cli#1225
- awslogs-endpoint logging option. moby/moby#37374
- POST /session endpoint out of experimental. moby/moby#40028
- "<unknown>" in /info response. moby/moby#37472
- --console=[auto,false,true] to --progress=[auto,plain,tty]. docker/cli#1276
- --data-path-addr flags when connected to a daemon that doesn’t support this option. docker/cli#1240
- -ce / -ee suffix from version string. docker-ce-packaging#206
- COPY/ADD. moby/moby#37563
- trust inspect typo: "AdminstrativeKeys". docker/cli#1300
- docker image prune with a large list of dangling images. docker/cli#1432 / docker/cli#1423
- /etc/docker directory to prevent "permission denied" errors when using docker manifest inspect. docker/engine#56 / moby/moby#37847
- cpuset-cpus and cpuset-mems. docker/engine#70 / moby/moby#37967
- --platform to docker import. docker/cli#1375 / docker/cli#1371
- --follow. docker/engine#48 / moby/moby#37576 / moby/moby#37734
- CAP_SYS_NICE in default seccomp profile. moby/moby#37242
- CAP_SYS_ADMIN or CAP_SYSLOG. docker/engine#64 / moby/moby#37929

There are important changes to the upgrade process that, if not correctly followed, can have impact on the availability of applications running on the Swarm during upgrades. These constraints impact any upgrades coming from any version before 18.09 to version 18.09 or greater.
With https://github.com/boot2docker/boot2docker/releases/download/v18.09.0/boot2docker.iso, connections from a node on the virtual machine are refused. Swarm ports published in VirtualBox-created docker-machine VMs do not respond. This occurs on macOS and Windows 10, using docker-machine versions 0.15 and 0.16.
The following docker run command works, allowing access from the host browser:
docker run -d -p 4000:80 nginx
However, the following docker service command fails, resulting in curl/Chrome being unable to connect (connection refused):
docker service create -p 5000:80 nginx
This issue is not apparent when provisioning 18.09.0 cloud VMs using docker-machine.
Workarounds:

- docker run is unaffected.

This issue is resolved in 18.09.1.
As of EE 2.1, Docker has deprecated support for Device Mapper as a storage driver. It will continue to be supported at this time, but support will be removed in a future release. Docker will continue to support Device Mapper for existing EE 2.0 and 2.1 customers. Please contact Sales for more information.
Docker recommends that existing customers migrate to using Overlay2 for the storage driver. The Overlay2 storage driver is now the default for Docker engine implementations.
As of EE 2.1, Docker has deprecated support for IBM Z (s390x). Refer to the Docker Compatibility Matrix for detailed compatibility information.
For more information on the list of deprecated flags and APIs, have a look at the deprecation information where you can find the target removal dates.
In this release, Docker has also removed support for TLS < 1.2 moby/moby#37660, Ubuntu 14.04 “Trusty Tahr” docker-ce-packaging#255 / docker-ce-packaging#254, and Debian 8 “Jessie” docker-ce-packaging#255 / docker-ce-packaging#254.
There are two ways to install and upgrade Docker Enterprise Edition (Docker EE) on CentOS:
This section lists what you need to consider before installing Docker EE. Items that require action are explained below.
- CentOS 64-bit, latest version, running on x86_64.
- Storage driver: overlay2 or devicemapper (direct-lvm mode in production).
- The URL of a Docker EE repository, set up under /etc/yum.repos.d/.
Docker EE supports CentOS 64-bit, latest version, running on x86_64.

On CentOS, Docker EE supports the overlay2 and devicemapper storage drivers. In Docker EE 17.06.2-ee-5 and higher, overlay2 is the recommended storage driver. The following limitations apply:
- If selinux is enabled, the overlay2 storage driver is supported on CentOS 7.4 or higher. If selinux is disabled, overlay2 is supported on CentOS 7.2 or higher with kernel version 3.10.0-693 and higher.
- With devicemapper, you must use direct-lvm mode, which requires one or more dedicated block devices. Fast storage such as solid-state media (SSD) is recommended.

To install Docker EE, you will need the URL of the Docker EE repository associated with your trial or subscription. You will use this URL in a later step to create a variable called DOCKERURL.
The Docker EE package is called docker-ee. Older versions were called docker or docker-engine. Uninstall all older versions and associated dependencies. The contents of /var/lib/docker/ are preserved, including images, containers, volumes, and networks. If you are upgrading from Docker Engine - Community to Docker EE, remove the Docker Engine - Community package as well.
$ sudo yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
The advantage of using a repository from which to install Docker EE (or any software) is that it provides a certain level of automation. RPM-based distributions such as CentOS use a tool called YUM that works with your repositories to manage dependencies and provide automatic updates.
You only need to set up the repository once, after which you can install Docker EE from the repo and repeatedly upgrade as necessary.
Remove existing Docker repositories from /etc/yum.repos.d/:

$ sudo rm /etc/yum.repos.d/docker*.repo

Temporarily store the URL (that you copied above) in an environment variable. Replace <DOCKER-EE-URL> with your URL in the following command. This variable assignment does not persist when the session ends:

$ export DOCKERURL="<DOCKER-EE-URL>"

Store the value of the variable, DOCKERURL (from the previous step), in a yum variable in /etc/yum/vars/:

$ sudo -E sh -c 'echo "$DOCKERURL/centos" > /etc/yum/vars/dockerurl'

Install required packages: yum-utils provides the yum-config-manager utility, and device-mapper-persistent-data and lvm2 are required by the devicemapper storage driver:
$ sudo yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
Add the Docker EE stable repository:
$ sudo -E yum-config-manager \
--add-repo \
"$DOCKERURL/centos/docker-ee.repo"
Install the latest patch release, or go to the next step to install a specific version:
$ sudo yum -y install docker-ee docker-ee-cli containerd.io
If prompted to accept the GPG key, verify that the fingerprint matches 77FE DA13 1A83 1D29 A418 D3E8 99E5 FF2E 7668 2BC9, and if so, accept it.

To install a specific version of Docker EE (recommended in production), list versions and install:

List and sort the versions available in your repo. This example sorts results by version number, highest to lowest, and is truncated:

$ sudo yum list docker-ee --showduplicates | sort -r

docker-ee.x86_64 19.03.ee.2-1.el7.centos docker-ee-stable-18.09

The list returned depends on which repositories you enabled, and is specific to your version of CentOS (indicated by .el7 in this example).

Install a specific version by its fully qualified package name, which is the package name (docker-ee) plus the version string (2nd column) starting at the first colon (:), up to the first hyphen, separated by a hyphen (-). For example, docker-ee-18.09.1.
$ sudo yum -y install docker-ee-<VERSION_STRING> docker-ee-cli-<VERSION_STRING> containerd.io
For example, if you want to install the 18.09 version, run the following:
sudo yum-config-manager --enable docker-ee-stable-18.09
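As an illustrative sequence (assuming the 18.09.1 package exists in your repository), enabling the 18.09 repo and then installing that specific version looks like:

sudo yum-config-manager --enable docker-ee-stable-18.09
sudo yum -y install docker-ee-18.09.1 docker-ee-cli-18.09.1 containerd.io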
Docker is installed but not started. The docker group is created, but no users are added to the group.

Start Docker:

Note

If using devicemapper, ensure it is properly configured before starting Docker.

$ sudo systemctl start docker

Verify that Docker EE is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:

$ sudo docker run hello-world

Docker EE is installed and running. Use sudo to run Docker commands.
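Optionally, to run Docker commands without sudo, you can add your user to the docker group (note that members of this group effectively have root-level access to the daemon):

$ sudo usermod -aG docker $USER

Log out and back in for the group membership to take effect.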
To manually install Docker Enterprise, download the .rpm file for your release. You need to download a new file each time you want to upgrade Docker Enterprise.

Go to the Docker EE repository URL associated with your trial or subscription in your browser. Go to centos/7/x86_64/stable-<VERSION>/Packages and download the .rpm file for the Docker version you want to install.
Install Docker Enterprise, changing the path below to the path where you downloaded the Docker package.
$ sudo yum install /path/to/package.rpm
Docker is installed but not started. The docker group is created, but no users are added to the group.

Start Docker:

Note

If using devicemapper, ensure it is properly configured before starting Docker.

$ sudo systemctl start docker

Verify that Docker EE is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:

$ sudo docker run hello-world

Docker EE is installed and running. Use sudo to run Docker commands.
Uninstall the Docker EE package:
$ sudo yum -y remove docker-ee
Delete all images, containers, and volumes (because these are not automatically removed from your host):
$ sudo rm -rf /var/lib/docker
Delete other Docker related resources:
$ sudo rm -rf /run/docker
$ sudo rm -rf /var/run/docker
$ sudo rm -rf /etc/docker
If desired, remove the devicemapper thin pool and reformat the block devices that were part of it.

You must delete any edited configuration files manually.
There are two ways to install and upgrade Docker Enterprise on Oracle Linux:
This section lists what you need to consider before installing Docker EE. Items that require action are explained below.
- devicemapper storage driver only (direct-lvm mode in production).
- The URL of a Docker EE repository, set up under /etc/yum.repos.d/.
.Docker Engine - Enterprise supports Oracle Linux 64-bit, versions 7.3 and higher, running the Red Hat Compatible kernel (RHCK) 3.10.0-514 or higher. Older versions of Oracle Linux are not supported.
On Oracle Linux, Docker Engine - Enterprise only supports the devicemapper storage driver. In production, you must use it in direct-lvm mode, which requires one or more dedicated block devices. Fast storage such as solid-state media (SSD) is recommended.
To install Docker EE, you will need the URL of the Docker EE repository associated with your trial or subscription:
You will use this URL in a later step to create a variable called DOCKERURL.
The Docker Engine - Enterprise package is called docker-ee. Older versions were called docker or docker-engine. Uninstall all older versions and associated dependencies. The contents of /var/lib/docker/ are preserved, including images, containers, volumes, and networks.
$ sudo yum remove docker \
docker-engine \
docker-engine-selinux
The advantage of using a repository from which to install Docker Engine - Enterprise (or any software) is that it provides a certain level of automation. RPM-based distributions such as Oracle Linux use a tool called YUM that works with your repositories to manage dependencies and provide automatic updates.
You only need to set up the repository once, after which you can install Docker Engine - Enterprise from the repo and repeatedly upgrade as necessary.
Remove existing Docker repositories from /etc/yum.repos.d/:

$ sudo rm /etc/yum.repos.d/docker*.repo

Temporarily store the URL (that you copied above) in an environment variable. Replace <DOCKER-EE-URL> with your URL in the following command. This variable assignment does not persist when the session ends:

$ export DOCKERURL="<DOCKER-EE-URL>"

Store the value of the variable, DOCKERURL (from the previous step), in a yum variable in /etc/yum/vars/:

$ sudo -E sh -c 'echo "$DOCKERURL/oraclelinux" > /etc/yum/vars/dockerurl'

Install required packages: yum-utils provides the yum-config-manager utility, and device-mapper-persistent-data and lvm2 are required by the devicemapper storage driver:
$ sudo yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
Enable the ol7_addons Oracle repository. This ensures access to the container-selinux package required by docker-ee.
$ sudo yum-config-manager --enable ol7_addons
Add the Docker Engine - Enterprise stable repository:
$ sudo -E yum-config-manager \
--add-repo \
"$DOCKERURL/oraclelinux/docker-ee.repo"
Install the latest patch release, or go to the next step to install a specific version:
$ sudo yum -y install docker-ee docker-ee-cli containerd.io
If prompted to accept the GPG key, verify that the fingerprint matches 77FE DA13 1A83 1D29 A418 D3E8 99E5 FF2E 7668 2BC9, and if so, accept it.

To install a specific version of Docker Engine - Enterprise (recommended in production), list versions and install:

List and sort the versions available in your repo. This example sorts results by version number, highest to lowest, and is truncated:

$ sudo yum list docker-ee --showduplicates | sort -r

docker-ee.x86_64 19.03.ee.2-1.el7.oraclelinux docker-ee-stable-18.09

The list returned depends on which repositories you enabled, and is specific to your version of Oracle Linux (indicated by .el7 in this example).

Install a specific version by its fully qualified package name, which is the package name (docker-ee) plus the version string (2nd column) starting at the first colon (:), up to the first hyphen, separated by a hyphen (-). For example, docker-ee-18.09.1.
$ sudo yum -y install docker-ee-<VERSION_STRING> docker-ee-cli-<VERSION_STRING> containerd.io
For example, if you want to install the 18.09 version, run the following:
sudo yum-config-manager --enable docker-ee-stable-18.09
Docker is installed but not started. The docker group is created, but no users are added to the group.

Start Docker:

Note

If using devicemapper, ensure it is properly configured before starting Docker.

$ sudo systemctl start docker

Verify that Docker Engine - Enterprise is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:

$ sudo docker run hello-world

Docker Engine - Enterprise is installed and running. Use sudo to run Docker commands.
To manually install Docker Enterprise, download the .rpm file for your release. You need to download a new file each time you want to upgrade Docker Enterprise.

Go to the Docker Engine - Enterprise repository URL associated with your trial or subscription in your browser. Go to oraclelinux/. Choose your Oracle Linux version, architecture, and Docker version. Download the .rpm file from the Packages directory.
Install Docker Enterprise, changing the path below to the path where you downloaded the Docker package.
$ sudo yum install /path/to/package.rpm
Docker is installed but not started. The docker group is created, but no users are added to the group.

Start Docker:

Note

If using devicemapper, ensure it is properly configured before starting Docker.

$ sudo systemctl start docker

Verify that Docker Engine - Enterprise is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:

$ sudo docker run hello-world

Docker Engine - Enterprise is installed and running. Use sudo to run Docker commands.
To upgrade, use yum -y upgrade instead of yum -y install, and point to the new file.
the new file.Uninstall the Docker Engine - Enterprise package:
$ sudo yum -y remove docker-ee
Delete all images, containers, and volumes (because these are not automatically removed from your host):
$ sudo rm -rf /var/lib/docker
Delete other Docker related resources:
$ sudo rm -rf /run/docker
$ sudo rm -rf /var/run/docker
$ sudo rm -rf /etc/docker
If desired, remove the devicemapper thin pool and reformat the block devices that were part of it.

You must delete any edited configuration files manually.
There are two ways to install and upgrade Docker Enterprise on Red Hat Enterprise Linux:
This section lists what you need to consider before installing Docker EE. Items that require action are explained below.
- Red Hat Enterprise Linux 64-bit, versions 7.4 and higher, running on x86_64.
- Storage driver: overlay2 or devicemapper (direct-lvm mode in production).
- The URL of a Docker EE repository, set up under /etc/yum.repos.d/.
Docker EE supports Red Hat Enterprise Linux 64-bit, versions 7.4 and higher, running on x86_64.
On Red Hat Enterprise Linux, Docker EE supports the overlay2 and devicemapper storage drivers. In Docker EE 17.06.2-ee-5 and higher, overlay2 is the recommended storage driver. The following limitations apply:
- If selinux is enabled, the overlay2 storage driver is supported on RHEL 7.4 or higher. If selinux is disabled, overlay2 is supported on RHEL 7.2 or higher with kernel version 3.10.0-693 and higher.
- With devicemapper, you must use direct-lvm mode, which requires one or more dedicated block devices. Fast storage such as solid-state media (SSD) is recommended.

Federal Information Processing Standards (FIPS) Publication 140-2 is a United States Federal security requirement for cryptographic modules.
With Docker EE Basic license for versions 18.03 and later, Docker provides FIPS 140-2 support in RHEL 7.3, 7.4 and 7.5. This includes a FIPS supported cryptographic module. If the RHEL implementation already has FIPS support enabled, FIPS is also automatically enabled in the Docker engine.
To verify the FIPS 140-2 module is enabled in the Linux kernel, confirm the file /proc/sys/crypto/fips_enabled contains 1.
$ cat /proc/sys/crypto/fips_enabled
1
Note
FIPS is only supported in Docker Engine EE. UCP and DTR currently do not have support for FIPS 140-2.
To enable FIPS 140-2 compliance on a system that is not in FIPS 140-2 mode, do the following:
Create a file called /etc/systemd/system/docker.service.d/fips-module.conf. Add the following:
[Service]
Environment="DOCKER_FIPS=1"
Reload the Docker configuration to systemd.
$ sudo systemctl daemon-reload
Restart the Docker service as root.
$ sudo systemctl restart docker
To confirm Docker is running with FIPS 140-2 enabled, run the docker info command.
docker info --format {{.SecurityOptions}}
[name=selinux name=fips]
If the system has the FIPS 140-2 cryptographic module installed on the operating system, it is possible to disable FIPS 140-2 compliance.
To disable FIPS 140-2 in Docker but not the operating system, set the value DOCKER_FIPS=0 in /etc/systemd/system/docker.service.d/fips-module.conf.
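A sketch of the resulting drop-in file, mirroring the enable step above:

[Service]
Environment="DOCKER_FIPS=0"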
Reload the Docker configuration to systemd.
$ sudo systemctl daemon-reload
Restart the Docker service as root.
$ sudo systemctl restart docker
To install Docker Enterprise, you will need the URL of the Docker Enterprise repository associated with your trial or subscription:
You will use this URL in a later step to create a variable called DOCKERURL.
The Docker EE package is called docker-ee. Older versions were called docker or docker-engine. Uninstall all older versions and associated dependencies. The contents of /var/lib/docker/ are preserved, including images, containers, volumes, and networks.
$ sudo yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
The advantage of using a repository from which to install Docker EE (or any software) is that it provides a certain level of automation. RPM-based distributions such as Red Hat Enterprise Linux use a tool called YUM that works with your repositories to manage dependencies and provide automatic updates.
Disable SELinux before installing Docker EE 17.06.xx on IBM Z systems

There is currently no support for selinux on IBM Z systems. If you attempt to install or upgrade Docker EE on an IBM Z system with selinux enabled, an error is thrown that the container-selinux package is not found. Disable selinux before installing or upgrading Docker on IBM Z.
You only need to set up the repository once, after which you can install Docker EE from the repo and repeatedly upgrade as necessary.
Remove existing Docker repositories from /etc/yum.repos.d/:
$ sudo rm /etc/yum.repos.d/docker*.repo
Temporarily store the URL (that you copied above) in an environment variable. Replace <DOCKER-EE-URL> with your URL in the following command. This variable assignment does not persist when the session ends:
$ export DOCKERURL="<DOCKER-EE-URL>"
Store the value of the variable, DOCKERURL (from the previous step), in a yum variable in /etc/yum/vars/:
$ sudo -E sh -c 'echo "$DOCKERURL/rhel" > /etc/yum/vars/dockerurl'
Also, store your OS version string in /etc/yum/vars/dockerosversion. Most users should use 7, but you can also use the more specific minor version, starting from 7.2.
$ sudo sh -c 'echo "7" > /etc/yum/vars/dockerosversion'
Install required packages: yum-utils provides the yum-config-manager utility, and device-mapper-persistent-data and lvm2 are required by the devicemapper storage driver:
$ sudo yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
Enable the extras RHEL repository. This ensures access to the container-selinux package required by docker-ee.
The repository can differ per your architecture and cloud provider, so review the options in this step before running:
For all architectures except IBM Power:
$ sudo yum-config-manager --enable rhel-7-server-extras-rpms
For IBM Power only (little endian):
$ sudo yum-config-manager --enable extras
$ sudo subscription-manager repos --enable=rhel-7-for-power-le-extras-rpms
$ sudo yum makecache fast
$ sudo yum -y install container-selinux
Depending on cloud provider, you may also need to enable another repository:
For AWS (where REGION is a literal, and does not represent the region your machine is running in):
$ sudo yum-config-manager --enable rhui-REGION-rhel-server-extras
For Azure:
$ sudo yum-config-manager --enable rhui-rhel-7-server-rhui-extras-rpms
Add the Docker EE stable repository:
$ sudo -E yum-config-manager \
--add-repo \
"$DOCKERURL/rhel/docker-ee.repo"
Install the latest patch release, or go to the next step to install a specific version:
$ sudo yum -y install docker-ee docker-ee-cli containerd.io
If prompted to accept the GPG key, verify that the fingerprint matches 77FE DA13 1A83 1D29 A418 D3E8 99E5 FF2E 7668 2BC9, and if so, accept it.

To install a specific version of Docker EE (recommended in production), list versions and install:

List and sort the versions available in your repo. This example sorts results by version number, highest to lowest, and is truncated:

$ sudo yum list docker-ee --showduplicates | sort -r

docker-ee.x86_64 19.03.ee.2-1.el7.rhel docker-ee-stable-18.09

The list returned depends on which repositories you enabled, and is specific to your version of Red Hat Enterprise Linux (indicated by .el7 in this example).

Install a specific version by its fully qualified package name, which is the package name (docker-ee) plus the version string (2nd column) starting at the first colon (:), up to the first hyphen, separated by a hyphen (-). For example, docker-ee-18.09.1.
$ sudo yum -y install docker-ee-<VERSION_STRING> docker-ee-cli-<VERSION_STRING> containerd.io
For example, if you want to install the 18.09 version, run the following:
sudo yum-config-manager --enable docker-ee-stable-18.09
Docker is installed but not started. The docker group is created, but no users are added to the group.

Start Docker:

Note

If using devicemapper, ensure it is properly configured before starting Docker.

$ sudo systemctl start docker

Verify that Docker EE is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:

$ sudo docker run hello-world

Docker EE is installed and running. Use sudo to run Docker commands.
To manually install Docker Enterprise, download the .rpm file for your release. You need to download a new file each time you want to upgrade Docker EE.
Disable SELinux before installing Docker EE on IBM Z systems

There is currently no support for selinux on IBM Z systems. If you attempt to install or upgrade Docker EE on an IBM Z system with selinux enabled, an error is thrown that the container-selinux package is not found. Disable selinux before installing or upgrading Docker on IBM Z.
Enable the extras RHEL repository. This ensures access to the container-selinux package which is required by docker-ee:
$ sudo yum-config-manager --enable rhel-7-server-extras-rpms
Alternately, obtain that package manually from Red Hat. There is no way to publicly browse this repository.
Go to the Docker EE repository URL associated with your trial or subscription in your browser. Go to rhel/. Choose your Red Hat Enterprise Linux version, architecture, and Docker version. Download the .rpm file from the Packages directory.
Note
If you have trouble with selinux using the packages under the 7 directory, try choosing the version-specific directory instead, such as 7.3.
Install Docker EE, changing the path below to the path where you downloaded the Docker package.
$ sudo yum install /path/to/package.rpm
Docker is installed but not started. The docker group is created, but no users are added to the group.
Start Docker:
Note
If using devicemapper, ensure it is properly configured before starting Docker, per the storage guide.
$ sudo systemctl start docker
Verify that Docker EE is installed correctly by running the hello-world image. This command downloads a test image, runs it in a container, prints an informational message, and exits:
$ sudo docker run hello-world
Docker EE is installed and running. Use sudo to run Docker commands. See Linux postinstall to allow non-privileged users to run Docker commands.
Uninstall the Docker EE package:
$ sudo yum -y remove docker-ee
Delete all images, containers, and volumes (because these are not automatically removed from your host):
$ sudo rm -rf /var/lib/docker
Delete other Docker related resources:
$ sudo rm -rf /run/docker
$ sudo rm -rf /var/run/docker
$ sudo rm -rf /etc/docker
If desired, remove the devicemapper thin pool and reformat the block devices that were part of it.
Note
You must delete any edited configuration files manually.
To install Docker Engine - Enterprise (Docker EE), you need to know the Docker EE repository URL associated with your trial or subscription. These instructions work for Docker on SLES and for Docker on Linux, which includes access to Docker EE for all Linux distributions. To get this information, do the following:
Use this URL when you see the placeholder text <DOCKER-EE-URL>.
To install Docker EE, you need the 64-bit version of SLES 12.x, running on the x86_64, s390x (IBM Z), or ppc64le (IBM Power) architecture. Docker EE is not supported on OpenSUSE.
The only supported storage driver for Docker EE on SLES is Btrfs, which is used by default if the underlying filesystem hosting /var/lib/docker/ is a Btrfs filesystem.
Docker creates a DOCKER iptables chain when it starts. The SUSE firewall may block access to this chain, which can prevent you from running containers with published ports. You may see errors such as the following:
WARNING: IPv4 forwarding is disabled. Networking will not work.
docker: Error response from daemon: driver failed programming external
connectivity on endpoint adoring_ptolemy
(0bb5fa80bc476f8a0d343973929bb3b7c039fc6d7cd30817e837bc2a511fce97):
(iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 80 -j DNAT --to-destination 172.17.0.2:80 ! -i docker0: iptables: No chain/target/match by that name.
(exit status 1)).
If you see errors like this, adjust the start-up script order so that the firewall is started before Docker, and Docker stops before the firewall stops. See the SLES documentation on init script order.
Older versions of Docker were called docker or docker-engine. If you use OS images from a cloud provider, you may need to remove the runc package, which conflicts with Docker. If these are installed, uninstall them, along with associated dependencies.
$ sudo zypper rm docker docker-engine runc
If removal of the docker-engine
package fails, use the following
command instead:
$ sudo rpm -e docker-engine
It’s OK if zypper
reports that none of these packages are installed.
The contents of /var/lib/docker/
, including images, containers,
volumes, and networks, are preserved. The Docker EE
package is now called docker-ee
.
By default, SLES formats the / filesystem using Btrfs, so most people do not need to do the steps in this section. If you use OS images from a cloud provider, you may need to do this step. If the filesystem that hosts /var/lib/docker/ is not a Btrfs filesystem, you must configure a Btrfs filesystem and mount it on /var/lib/docker/.
Check whether / (or /var/ or /var/lib/ or /var/lib/docker/, if they are separate mount points) is formatted using Btrfs. If you do not have separate mount points for any of these, a duplicate result for / is returned.
$ df -T / /var /var/lib /var/lib/docker
You need to complete the rest of these steps only if one of the following is true:
- You have a separate /var/ filesystem that is not formatted with Btrfs.
- You do not have a separate /var/, /var/lib/, or /var/lib/docker/ filesystem, and / is not formatted with Btrfs.
If /var/lib/docker is already a separate mount point and is not formatted with Btrfs, back up its contents so that you can restore them after the new Btrfs filesystem is mounted.
Format your dedicated block device or devices as a Btrfs filesystem.
This example assumes that you are using two block devices called
/dev/xvdf
and /dev/xvdg
. Make sure you are using the right
device names.
Note
Double-check the block device names because this is a destructive operation.
$ sudo mkfs.btrfs -f /dev/xvdf /dev/xvdg
There are many more options for Btrfs, including striping and RAID. See the Btrfs documentation.
Mount the new Btrfs filesystem on the /var/lib/docker/
mount
point. You can specify any of the block devices used to create the
Btrfs filesystem.
$ sudo mount -t btrfs /dev/xvdf /var/lib/docker
Don’t forget to make the change permanent across reboots by adding an
entry to /etc/fstab
.
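As a sketch of such an entry, assuming the /dev/xvdf device from the example above (using the device's UUID instead is generally more robust):
/dev/xvdf /var/lib/docker btrfs defaults 0 0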
If /var/lib/docker
previously existed and you backed up its
contents during step 1, restore them onto /var/lib/docker
.
You can install Docker EE in different ways, depending on your needs.
Before you install Docker EE for the first time on a new host machine, you need to set up the Docker repository. Afterward, you can install and update Docker from the repository.
Temporarily add the $DOCKER_EE_BASE_URL
and $DOCKER_EE_URL
variables into your environment. This only persists until you log out
of the session. Replace <DOCKER-EE-URL>
listed below with the URL
you noted down in the prerequisites.
$ DOCKER_EE_BASE_URL="<DOCKER-EE-URL>"
$ DOCKER_EE_URL="${DOCKER_EE_BASE_URL}/sles/<SLES_VERSION>/<ARCH>/stable-<DOCKER_VERSION>"
Where:
- DOCKER-EE-URL is the URL from your Docker Hub subscription.
- ARCH is x86_64, s390x, or ppc64le.
- DOCKER_VERSION is the Docker EE version, for example 18.09.
As an example, your command should look like:
DOCKER_EE_BASE_URL="https://storebits.docker.com/ee/sles/sub-555-55-555"
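Continuing that example, the second variable might look like the following; the 12.3, x86_64, and 18.09 values are illustrations only, so substitute your own SLES version, architecture, and Docker version:
DOCKER_EE_URL="${DOCKER_EE_BASE_URL}/sles/12.3/x86_64/stable-18.09"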
Use the following command to set up the stable repository. Use the command as-is. It works because of the variable you set in the previous step.
$ sudo zypper addrepo $DOCKER_EE_URL docker-ee-stable
Import the GPG key from the repository. Replace <DOCKER-EE-URL>
with the URL you noted down in the
prerequisites.
$ sudo rpm --import "${DOCKER_EE_BASE_URL}/sles/gpg"
Update the zypper
package index.
$ sudo zypper refresh
If this is the first time you have refreshed the package index since
adding the Docker repositories, you are prompted to accept the GPG
key, and the key’s fingerprint is shown. Verify that the fingerprint
matches 77FE DA13 1A83 1D29 A418 D3E8 99E5 FF2E 7668 2BC9
and if
so, accept the key.
Install the latest version of Docker EE and containerd, or go to the next step to install a specific version.
$ sudo zypper install docker-ee docker-ee-cli containerd.io
Start Docker.
$ sudo service docker start
On production systems, you should install a specific version of
Docker EE instead of always using the latest. List
the available versions. The following example only lists binary
packages and is truncated. To also list source packages, omit the
-t package
flag from the command.
$ zypper search -s --match-exact -t package docker-ee
Loading repository data...
Reading installed packages...
S | Name | Type | Version | Arch | Repository
--+---------------+---------+----------+--------+---------------
| docker-ee | package | 19.03-1 | x86_64 | docker-ee-stable
The contents of the list depend upon which repositories you have
enabled. Choose a specific version to install. The third column is
the version string. The fifth column is the repository name, which
indicates which repository the package is from and by extension its
stability level. To install a specific version, append the version
string to the package name and separate them by a hyphen (-
):
$ sudo zypper install docker-ee-<VERSION_STRING> docker-ee-cli-<VERSION_STRING> containerd.io
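For instance, using the version string shown in the listing above (an illustration only; the versions available to you may differ):
$ sudo zypper install docker-ee-19.03-1 docker-ee-cli-19.03-1 containerd.io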
Docker is installed but not started. The docker
group is created,
but no users are added to the group.
Configure Docker to use the Btrfs storage driver. This is only required if the / filesystem is not using Btrfs. However, explicitly specifying the storage-driver has no harmful side effects.
Edit the file /etc/docker/daemon.json
(create it if it does not
exist) and add the following contents:
{
"storage-driver": "btrfs"
}
Save and close the file.
Start Docker.
$ sudo service docker start
Verify that Docker is installed correctly by running the
hello-world
image.
$ sudo docker run hello-world
This command downloads a test image and runs it in a container. When the container runs, it prints an informational message and exits.
Docker EE is installed and running. You need to use
sudo
to run Docker commands.
Important
Be sure Docker is configured to start after the system firewall. See Firewall configuration.
To upgrade Docker EE, run sudo zypper refresh to update the package index, then follow the installation instructions above, choosing the new version you want to install.
If you cannot use the official Docker repository to install Docker
EE, you can download the .rpm
file for your release
and install it manually. You need to download a new file each time you
want to upgrade Docker.
Go to the Docker EE repository URL associated with your trial or subscription in your browser. Go to sles/12.3/ and choose the directory corresponding to your architecture and desired Docker EE version. Download the .rpm file from the Packages directory.
Import Docker’s official GPG key.
$ sudo rpm --import <DOCKER-EE-URL>/sles/gpg
Install Docker EE, changing the path below to the path where you downloaded the Docker package.
$ sudo zypper install /path/to/package.rpm
Docker is installed but not started. The docker
group is created,
but no users are added to the group.
Configure Docker to use the Btrfs storage driver. This is only required if the / filesystem is not using Btrfs. However, explicitly specifying the storage-driver has no harmful side effects.
Edit the file /etc/docker/daemon.json
(create it if it does not
exist) and add the following contents:
{
"storage-driver": "btrfs"
}
Save and close the file.
Start Docker.
$ sudo service docker start
Verify that Docker is installed correctly by running the
hello-world
image.
$ sudo docker run hello-world
This command downloads a test image and runs it in a container. When the container runs, it prints an informational message and exits.
Docker EE is installed and running. You need to use
sudo
to run Docker commands.
Important
Be sure Docker is configured to start after the system firewall. See Firewall configuration.
To upgrade Docker EE, download the newer package file
and repeat the installation procedure,
using zypper update
instead of zypper install
, and pointing to
the new file.
Uninstall the Docker EE package using the command below.
$ sudo zypper rm docker-ee
Images, containers, volumes, and customized configuration files on your host are not automatically removed. To delete all images, containers, and volumes:
$ sudo rm -rf /var/lib/docker/*
If you used a separate Btrfs filesystem to host the contents of /var/lib/docker/, you can unmount and format the Btrfs filesystem.
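A minimal sketch of that cleanup, assuming the /dev/xvdf and /dev/xvdg devices from the earlier example (wipefs erases the filesystem signatures, so double-check the device names first):
$ sudo umount /var/lib/docker
$ sudo wipefs -a /dev/xvdf /dev/xvdg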
You must delete any edited configuration files manually.
To get started with Docker EE on Ubuntu, make sure you meet the prerequisites, then install Docker.
To install Docker Enterprise Edition (Docker EE), you need to know the Docker EE repository URL associated with your trial or subscription. These instructions work for Docker EE for Ubuntu and for Docker EE for Linux, which includes access to Docker EE for all Linux distributions. Note down this URL before you begin.
Use this URL when you see the placeholder text <DOCKER-EE-URL>.
To learn more about software requirements and supported storage drivers, check the compatibility matrix.
$ sudo apt-get remove docker docker-engine docker-ce docker-ce-cli docker.io
It’s OK if apt-get
reports that none of these packages are installed.
The contents of /var/lib/docker/
, including images, containers, volumes,
and networks, are preserved. The Docker EE package is now called docker-ee
.
If your version supports the aufs
storage driver, you need some preparation
before installing Docker.
You can install Docker EE in different ways, depending on your needs.
Before you install Docker EE for the first time on a new host machine, you need to set up the Docker repository. Afterward, you can install and update Docker EE from the repository.
Update the apt
package index.
$ sudo apt-get update
Install packages to allow apt
to use a repository over HTTPS.
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
Temporarily add a $DOCKER_EE_URL variable into your environment. This only persists until you log out of the session. Replace <DOCKER-EE-URL> with the URL you noted down in the prerequisites.
$ DOCKER_EE_URL="<DOCKER-EE-URL>"
Temporarily add a $DOCKER_EE_VERSION
variable into your environment.
$ DOCKER_EE_VERSION=19.03
Add Docker’s official GPG key using your customer Docker Engine - Enterprise repository URL.
$ curl -fsSL "${DOCKER_EE_URL}/ubuntu/gpg" | sudo apt-key add -
Verify that you now have the key with the fingerprint DD91 1E99 5A64 A202 E859 07D6 BC14 F10B 6D08 5F96, by searching for the last eight characters of the fingerprint.
$ sudo apt-key fingerprint 6D085F96
pub 4096R/0EBFCD88 2017-02-22
Key fingerprint = DD91 1E99 5A64 A202 E859 07D6 BC14 F10B 6D08 5F96
uid Docker Release (EE deb) <docker@docker.com>
sub 4096R/6D085F96 2017-02-22
Use the following command to set up the stable repository. Use the command as-is. It works because of the variable you set earlier.
$ sudo add-apt-repository \
"deb [arch=$(dpkg --print-architecture)] $DOCKER_EE_URL/ubuntu \
$(lsb_release -cs) \
stable-$DOCKER_EE_VERSION"
Update the apt
package index.
$ sudo apt-get update
Install the latest version of Docker EE and containerd, or go to the next step to install a specific version. Any existing installation of Docker EE is replaced.
$ sudo apt-get install docker-ee docker-ee-cli containerd.io
Warning
If you have multiple Docker repositories enabled, installing or
updating without specifying a version in the apt-get install
or apt-get update
command always installs the highest possible
version, which may not be appropriate for your stability needs.
On production systems, you should install a specific version of Docker EE instead of always using the latest. List the available versions. The following output is truncated.
$ apt-cache madison docker-ee
docker-ee | 19.03.0~ee-0~ubuntu-xenial | <DOCKER-EE-URL>/ubuntu xenial/stable amd64 Packages
The contents of the list depend upon which repositories are enabled,
and are specific to your version of Ubuntu (indicated by the
xenial
suffix on the version, in this example). Choose a specific
version to install. The second column is the version string. The
third column is the repository name, which indicates which repository
the package is from and by extension its stability level. To install
a specific version, append the version string to the package name and
separate them by an equals sign (=
).
$ sudo apt-get install docker-ee=<VERSION_STRING> docker-ee-cli=<VERSION_STRING> containerd.io
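For instance, using the version string from the apt-cache madison output above (an illustration only; the versions and Ubuntu release available to you may differ):
$ sudo apt-get install docker-ee=19.03.0~ee-0~ubuntu-xenial docker-ee-cli=19.03.0~ee-0~ubuntu-xenial containerd.io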
The Docker daemon starts automatically.
Verify that Docker is installed correctly by running the
hello-world
image.
$ sudo docker run hello-world
This command downloads a test image and runs it in a container. When the container runs, it prints an informational message and exits.
Docker EE is installed and running. The docker
group is created but no users are added to it. You need to use sudo
to run Docker commands.
To upgrade Docker EE, run sudo apt-get update to refresh the package index, then follow the installation instructions above, choosing the new version you want to install.
If you cannot use Docker’s repository to install Docker EE, you can download
the .deb
file for your release and install it manually. You need to
download a new file each time you want to upgrade Docker EE.
Go to the Docker EE repository URL associated with
your trial or subscription in your browser. Go to
ubuntu/x86_64/stable-<VERSION>
and download the .deb
file for the
Docker EE version and architecture you want to install.
Install Docker, changing the path below to the path where you downloaded the Docker EE package.
$ sudo dpkg -i /path/to/package.deb
The Docker daemon starts automatically.
Verify that Docker is installed correctly by running the
hello-world
image.
$ sudo docker run hello-world
This command downloads a test image and runs it in a container. When the container runs, it prints an informational message and exits.
Docker EE is installed and running. The docker
group is created but no users are added to it. You need to use sudo
to run Docker commands.
To upgrade Docker EE, download the newer package file and repeat the installation procedure, pointing to the new file.
Uninstall the Docker EE package.
$ sudo apt-get purge docker-ee
Images, containers, volumes, or customized configuration files on your host are not automatically removed. To delete all images, containers, and volumes:
$ sudo rm -rf /var/lib/docker
You must delete any edited configuration files manually.
Docker Engine - Enterprise enables native Docker containers on Windows Server. Windows Server 2016 and later versions are supported. The Docker Engine - Enterprise installation package includes everything you need to run Docker on Windows Server. This topic describes pre-install considerations, and how to download and install Docker Engine - Enterprise.
The Windows OS requirements for specific CPU and RAM must also be met, as specified in the Windows Server requirements. These cover specific CPU and memory specs and capabilities (instruction sets like CMPXCHG16b, LAHF/SAHF, and PrefetchW, and security features such as DEP/NX).
To install the Docker Engine - Enterprise on your hosts, Docker provides a OneGet PowerShell Module.
Open an elevated PowerShell command prompt, and type the following commands.
Install-Module DockerMsftProvider -Force
Install-Package Docker -ProviderName DockerMsftProvider -Force
Check if a reboot is required, and if yes, restart your instance.
(Install-WindowsFeature Containers).RestartNeeded
If the output of this command is Yes, then restart the server with:
Restart-Computer
Test your Docker Engine - Enterprise installation by running the
hello-world
container.
docker run hello-world:nanoserver
Unable to find image 'hello-world:nanoserver' locally
nanoserver: Pulling from library/hello-world
bce2fbc256ea: Pull complete
3ac17e2e6106: Pull complete
8cac44e17f16: Pull complete
5e160e4d8db3: Pull complete
Digest: sha256:25eac12ba40f7591969085ab3fb9772e8a4307553c14ea72d0e6f98b2c8ced9d
Status: Downloaded newer image for hello-world:nanoserver
Hello from Docker!
This message shows that your installation appears to be working correctly.
Some advanced Docker features, such as swarm mode, require the fixes included in KB4015217 (or a later cumulative patch).
To install these updates, run sconfig and select option 6) Download and Install Updates.
Federal Information Processing Standards (FIPS) Publication 140-2 is a United States Federal security requirement for cryptographic modules.
With Docker Engine - Enterprise Basic license for versions 18.09 and later, Docker provides FIPS 140-2 support in Windows Server. This includes a FIPS supported cryptographic module. If the Windows implementation already has FIPS support enabled, FIPS is automatically enabled in the Docker engine.
Note
FIPS 140-2 is only supported in the Docker EE engine. UCP and DTR currently do not have support for FIPS 140-2.
To enable FIPS 140-2 compliance on a system that is not in FIPS 140-2 mode, execute the following command in PowerShell:
[System.Environment]::SetEnvironmentVariable("DOCKER_FIPS", "1", "Machine")
FIPS 140-2 mode may also be enabled via the Windows Registry. To update the pertinent registry key, execute the following PowerShell command as an Administrator:
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy\" -Name "Enabled" -Value "1"
Restart the Docker service by running the following command.
net stop docker
net start docker
To confirm Docker is running with FIPS-140-2 enabled, run the
docker info
command:
Labels:
com.docker.security.fips=enabled
Note
If the system has the FIPS-140-2 cryptographic module installed on the operating system, it is possible to disable FIPS-140-2 compliance. To disable FIPS-140-2 in Docker but not the operating system, set the DOCKER_FIPS environment variable to "0" with [System.Environment]::SetEnvironmentVariable, as shown above.
Use the following guide if you want to install Docker Engine - Enterprise manually, via a script, or on air-gapped systems.
In a PowerShell command prompt, download the installer archive on a machine that has internet access.
# On an online machine, download the zip file.
Invoke-WebRequest -UseBasicParsing -OutFile <FILENAME> <DOWNLOAD-URL>
If you need to download a specific Docker Engine - Enterprise release, all URLs can be found on this JSON index.
Copy the zip file to the machine where you want to install Docker. In a PowerShell command prompt, use the following commands to extract the archive, register, and start the Docker service.
# Stop Docker service
Stop-Service docker
# Extract the archive.
Expand-Archive <FILENAME> -DestinationPath $Env:ProgramFiles -Force
# Clean up the zip file.
Remove-Item -Force <FILENAME>
# Install Docker. This requires rebooting.
$null = Install-WindowsFeature containers
# Add Docker to the path for the current session.
$env:path += ";$env:ProgramFiles\docker"
# Optionally, modify PATH to persist across sessions.
$newPath = "$env:ProgramFiles\docker;" +
[Environment]::GetEnvironmentVariable("PATH",
[EnvironmentVariableTarget]::Machine)
[Environment]::SetEnvironmentVariable("PATH", $newPath,
[EnvironmentVariableTarget]::Machine)
# Register the Docker daemon as a service.
dockerd --register-service
# Start the Docker service.
Start-Service docker
Test your Docker Engine - Enterprise installation by running the
hello-world
container.
docker container run hello-world:nanoserver
To install a specific version, use the RequiredVersion
flag:
Install-Package -Name docker -ProviderName DockerMsftProvider -Force -RequiredVersion 19.03
...
Name Version Source Summary
---- ------- ------ -------
Docker 19.03 Docker Contains Docker Engine - Enterprise for use with Windows Server...
Installing specific Docker EE versions may require an update to previously installed DockerMsftProvider modules. To update:
Update-Module DockerMsftProvider
Then open a new PowerShell session for the update to take effect.
To update Docker Engine - Enterprise to the most recent release, specify
the -RequiredVersion
and -Update
flags:
Install-Package -Name docker -ProviderName DockerMsftProvider -RequiredVersion 19.03 -Update -Force
The required version number must match a version available on the JSON index.
Use the following commands to completely remove the Docker Engine - Enterprise from a Windows Server:
Leave any active Docker Swarm.
docker swarm leave --force
Remove all running and stopped containers.
docker rm -f $(docker ps --all --quiet)
Prune container data.
docker system prune --all --volumes
Uninstall Docker PowerShell Package and Module.
Uninstall-Package -Name docker -ProviderName DockerMsftProvider
Uninstall-Module -Name DockerMsftProvider
Clean up Windows Networking and file system.
Get-HNSNetwork | Remove-HNSNetwork
Remove-Item -Path "C:\ProgramData\Docker" -Recurse -Force
To add a Windows Server host to an existing Universal Control Plane cluster, follow the prerequisites and joining instructions.
Universal Control Plane (UCP) is the enterprise-grade cluster management solution from Docker. You install it on-premises or in your virtual private cloud, and it helps you manage your Docker cluster and applications through a single interface.
Centralized cluster management
With Docker, you can join up to thousands of physical or virtual machines together to create a container cluster that allows you to deploy your applications at scale. UCP extends the functionality provided by Docker to make it easier to manage your cluster from a centralized place.
You can manage and monitor your container cluster using a graphical UI.
Deploy, manage, and monitor
With UCP, you can manage from a centralized place all of the computing resources you have available, like nodes, volumes, and networks.
You can also deploy and monitor your applications and services.
Built-in security and access control
UCP has its own built-in authentication mechanism and integrates with LDAP services. It also has role-based access control (RBAC), so that you can control who can access and make changes to your cluster and applications.
UCP integrates with Docker Trusted Registry (DTR) so that you can keep the Docker images you use for your applications behind your firewall, where they are safe and can’t be tampered with.
You can also enforce security policies and only allow running applications that use Docker images you know and trust.
Use through the Docker CLI client
Because UCP exposes the standard Docker API, you can continue using the tools you already know, including the Docker CLI client, to deploy and manage your applications.
For example, you can use the docker info command to check the status of a cluster that’s managed by UCP:
docker info
This command produces the output that you expect from Docker Enterprise:
Containers: 38
Running: 23
Paused: 0
Stopped: 15
Images: 17
Server Version: 17.06
...
Swarm: active
NodeID: ocpv7el0uz8g9q7dmw8ay4yps
Is Manager: true
ClusterID: tylpv1kxjtgoik2jnrg8pvkg6
Managers: 1
…
Here you can learn about new features, bug fixes, breaking changes, and known issues for the latest UCP version. You can then use the upgrade instructions to upgrade your installation to the latest release.
(2020-11-12)
Component | Version |
---|---|
UCP | 3.1.16 |
Kubernetes | 1.11.10 |
Calico | 3.8.9 |
Interlock (nginx) | 1.14.2 |
(2020-08-10)
Component | Version |
---|---|
UCP | 3.1.15 |
Kubernetes | 1.11.10 |
Calico | 3.8.9 |
Interlock (nginx) | 1.14.2 |
Starting with this release, we moved the location of our offline bundles for DTR from https://packages.docker.com/caas/ to https://packages.mirantis.com/caas/ for the following versions.
Offline bundles for other previous versions of DTR will remain on the docker domain.
Due to infrastructure changes, licenses will no longer auto-update, and the related screens in DTR have been removed.
Added tracing to Interlock (ENGORC-7565).
2020-06-24
Component | Version |
---|---|
UCP | 3.1.14 |
Kubernetes | 1.11.10 |
Calico | 3.8.9 |
Interlock | 3.1.3 |
Interlock NGINX proxy | 1.14.2 |
Golang | 1.13.8 |
2020-03-10
Component | Version |
---|---|
UCP | 3.1.13 |
Kubernetes | 1.11.10 |
Calico | 3.8.2 |
Interlock | 3.0.0 |
Interlock NGINX proxy | 1.14.2 |
Golang | 1.13.8 |
2019-11-14
Permission checks are now applied to the VolumesFrom Containers option. Previously, this field was ignored by the container create request parser, leading to a gap in permissions checks. (ENGORC-2781)
Component | Version |
---|---|
UCP | 3.1.12 |
Kubernetes | 1.11.10 |
Calico | 3.8.2 |
Interlock | 3.0.0 |
Interlock NGINX proxy | 1.14.2 |
2019-10-08
true, the proxy service no longer needs to restart when services are updated, reducing service interruptions. The proxy also does not have to restart when services are added or removed, as long as the set of service networks attached to the proxy is unchanged. If secrets or service networks need to be added or removed, the proxy service will restart as in previous releases. (ENGCORE-792)
com.docker.lb.network label does not match any of the networks to which the service is attached. (ENGCORE-837)
HTTPVersion is invalid. (FIELD-2046)
Component | Version |
---|---|
UCP | 3.1.11 |
Kubernetes | 1.11.10 |
Calico | 3.8.2 |
Interlock | 3.0.0 |
Interlock NGINX proxy | 1.14.2 |
2019-09-03
Component | Version |
---|---|
UCP | 3.1.10 |
Kubernetes | 1.11.10 |
Calico | 3.8.2 |
Interlock | 2.6.1 |
Interlock NGINX proxy | 1.14.2 |
2019-07-17
Added a robots.txt file to the root of the UCP API server.
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.9 |
Kubernetes | 1.11.10 |
Calico | 3.5.3 |
Interlock (NGINX) | 1.14.0 |
(2019-06-27)
Important
UCP 3.1.8 introduces new features such as setting the kubeletMaxPods option for all nodes in the cluster, and an updated UCP configuration file that allows admins to set default values for Swarm services. These features are not available in UCP 3.2.0. Customers using either of those features in UCP 3.1.8 or future versions of 3.1.x must upgrade to UCP 3.2.1 or later to avoid any upgrade issues.
A user_workload_defaults section has been added to the UCP configuration file that allows admins to set default field values that will be applied to Swarm services if those fields are not explicitly set when the service is created. Only a subset of Swarm service fields may be set.
Added the ability to set the kubeletMaxPods option for all nodes in the cluster. (ENGORC-2334)
10.96.0.0/16 at install time. See the documentation for more details. (ENGCORE-683)
Removed the pods/exec and pods/attach Kubernetes subresources from the migrated UCP View-Only role. (ENGORC-2434)
Upgrading from 3.1.4 to 3.1.5 causes a missing Swarm placement constraints banner for some Swarm services (ENGORC-2191). This can cause Swarm services to run unexpectedly on Kubernetes nodes. See https://www.docker.com/ddc-41 for more information.
ucp-*-s390x Swarm services. For example, ucp-auth-api-s390x.
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.8 |
Kubernetes | 1.11.10 |
Calico | 3.5.3 |
Interlock (nginx) | 1.14.0 |
2019-05-06
Upgrading from 3.1.4 to 3.1.5 causes a missing Swarm placement constraints banner for some Swarm services (ENGORC-2191). This can cause Swarm services to run unexpectedly on Kubernetes nodes.
ucp-*-s390x Swarm services. For example, ucp-auth-api-s390x.
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.7 |
Kubernetes | 1.11.9 |
Calico | 3.5.3 |
Interlock (nginx) | 1.14.0 |
2019-04-11
Accessing the ListAccount API endpoint now requires an admin user. Accessing the GetAccount API endpoint now requires an admin user, the actual user, or a member of the organization being inspected. ENGORC-100
Upgrading from 3.1.4 to 3.1.5 causes a missing Swarm placement constraints banner for some Swarm services (ENGORC-2191). This can cause Swarm services to run unexpectedly on Kubernetes nodes. See https://www.docker.com/ddc-41 for more information.
ucp-*-s390x Swarm services. For example, ucp-auth-api-s390x.
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.6 |
Kubernetes | 1.11.9 |
Calico | 3.5.3 |
Interlock (nginx) | 1.14.0 |
2019-03-28
Added the exclude_server_identity_headers field to the UCP config. If set to true, the headers are not included in UCP API responses. (docker/orca#16039)
update-action-failure to rollback. (ENGCORE-117)
ucp-interlock service image does not match expected version. (ENGORC-2081)
ucp-*-s390x Swarm services. For example, ucp-auth-api-s390x.
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.5 |
Kubernetes | 1.11.8 |
Calico | 3.5.2 |
Interlock (nginx) | 1.14.0 |
2019-02-28
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (docker/orca#14483)
/var partition.
Component | Version |
---|---|
UCP | 3.1.4 |
Kubernetes | 1.11.7 |
Calico | 3.5.0 |
Interlock (nginx) | 1.14.0 |
2019-01-29
Non-admin users can no longer create PersistentVolumes using the Local Storage Class, as this allowed non-admins to bypass security controls and mount host directories. (docker/orca#15936)
docker ps. (docker/orca#15812)
User or Group subjects requiring the ID of the user, organization, or team. (docker/orca#14935)
Component | Version |
---|---|
UCP | 3.1.3 |
Kubernetes | 1.11.5 |
Calico | 3.5.0 |
Interlock (nginx) | 1.14.0 |
2019-01-09
com.docker.lb.network label is not correctly specified. (docker/orca#15015)
Component | Version |
---|---|
UCP | 3.1.2 |
Kubernetes | 1.11.5 |
Calico | 3.2.3 |
Interlock (nginx) | 1.14.0 |
2018-12-04
Component | Version |
---|---|
UCP | 3.1.1 |
Kubernetes | 1.11.5 |
Calico | 3.2.3 |
Interlock (nginx) | 1.13.12 |
2018-11-08
ucp-auth services (#14539)
docker network ls --filter id=<id> now works with a UCP client bundle (#14840)
Users can now create Role and ClusterRole objects in the Kubernetes API. They can also grant permissions to users and service accounts with the RoleBinding and ClusterRoleBinding objects. The web interface for Kubernetes RBAC reflects these changes. Your old Kubernetes grants and roles will be automatically migrated during the UCP upgrade.
Admins can now enable audit logging in the UCP config. This logs all incoming user-initiated requests in the ucp-controller logs. Admins can choose whether to log only metadata for incoming requests or the full request body as well.
Admins can configure UCP to use a SAML-enabled identity provider for user authentication. If enabled, users who log into the UCP web interface are redirected to the identity provider’s website to log in. Upon login, users are redirected back to the UCP web interface, authenticated as the user chosen.
The ucp-metrics Prometheus server (used to render charts in the UCP interface) was moved from a container on manager nodes to a Kubernetes daemonset. This lets admins change the daemonset’s scheduling rules so that it runs on a set of worker nodes instead of manager nodes. Admins can designate certain UCP nodes to be metrics server nodes, freeing up resources on manager nodes.
Added a /metricsdiscovery endpoint so users can connect their own Prometheus instances to scrape UCP metrics data.
Added a custom_api_server_headers field in the UCP configuration to set arbitrary headers that are included with every UCP response.
There are several backward-incompatible changes in the Kubernetes API that may affect user workloads. They are:
A compatibility issue with the allowPrivilegeEscalation field that caused policies to start denying pods they previously allowed was fixed. If you defined PodSecurityPolicy objects using a 1.8.0 client or server and set allowPrivilegeEscalation to false, these objects must be reapplied after you upgrade.
The taint node.alpha.kubernetes.io/notReady was renamed to node.kubernetes.io/not-ready, and node.alpha.kubernetes.io/unreachable was renamed to node.kubernetes.io/unreachable. For more information about taints and tolerations, see Taints and Tolerations.
JSON resource files used with kubectl create -f pod.json containing fields with incorrect casing are no longer valid. You must correct these files before upgrading. When specifying keys in JSON resource definitions during direct API server communication, the keys are case-sensitive. A bug introduced in Kubernetes 1.8 caused the API server to accept a request with incorrect case and coerce it to correct case, but this behavior has been fixed in 1.11 so the API server will again enforce correct casing. During this time, the kubectl tool continued to enforce case-sensitive keys, so users that strictly manage resources with kubectl will be unaffected by this change.
User or Group subjects. (#14935)
For the User subject Kind, the Name field contains the ID of the user.
For the Group subject Kind, the format depends on whether you are creating a Binding for a team or an organization:
org:{org-id}
team:{org-id}:{team-id}
cluster-admin as the ClusterRole. Restricted Parameters on Containers include:
cluster-admin, restart the ucp-kube-apiserver container on any manager node to recreate them. (#14483)
/var partition.
The following features are deprecated in UCP 3.1.
/Swarm/ collection is now deprecated and will not be included in future versions of the product. However, current nested collections with more than 2 layers are still retained.
/Swarm/. For example, if a production collection called /Swarm/production is created under the shared cluster collection, /Swarm/, then only one level of nesting should be created: /Swarm/production/app/.
--cni-installer-url is deprecated in favor of --unmanaged-cni
Component | Version |
---|---|
UCP | 3.1.0 |
Kubernetes | 1.11.2 |
Calico | 3.2.3 |
Interlock (nginx) | 1.13.12 |
Universal Control Plane (UCP) is a containerized application that runs on Docker Enterprise, extending its functionality to simplify the deployment, configuration, and monitoring of your applications at scale.
UCP also secures Docker with role-based access control (RBAC) so that only authorized users can make changes and deploy applications to your Docker cluster.
Once the UCP instance is deployed, developers and IT operations no longer interact with Docker Engine directly, but interact with UCP instead. Since UCP exposes the standard Docker API, this is all done transparently, so that you can use the tools you already know and love, like the Docker CLI client and Docker Compose.
UCP leverages the clustering and orchestration functionality provided by Docker.
A swarm is a collection of nodes that are in the same Docker cluster. Nodes in a Docker swarm operate in one of two modes: manager or worker. If nodes are not already running in a swarm when installing UCP, nodes will be configured to run in swarm mode.
When you deploy UCP, it starts running a globally scheduled service
called ucp-agent
. This service monitors the node where it’s running
and starts and stops UCP services, based on whether the node is a
manager or a worker node.
If the node is a:
- Manager: the ucp-agent service automatically starts serving all UCP components, including the UCP web UI and data stores used by UCP. The ucp-agent service accomplishes this by deploying several containers on the node. By promoting a node to manager, UCP automatically becomes highly available and fault tolerant.
- Worker: the ucp-agent service starts serving a proxy service that ensures only authorized users and other UCP services can run Docker commands on that node. The ucp-agent service deploys a subset of containers on worker nodes.
The core component of UCP is a globally scheduled service called
ucp-agent
. When you install UCP on a node, or join a node to a swarm
that’s being managed by UCP, the ucp-agent
service starts running on
that node.
Once this service is running, it deploys containers with other UCP components, and it ensures they keep running. The UCP components that are deployed on a node depend on whether the node is a manager or a worker.
Note
Regarding OS-specific component names, some UCP component names
depend on the node’s operating system. For example, on Windows, the
ucp-agent
component is named ucp-agent-win
.
Manager nodes run all UCP services, including the web UI and data stores that persist the state of UCP. The following table shows the UCP services running on manager nodes.
UCP component | Description |
---|---|
k8s_calico-kube-controllers |
A cluster-scoped Kubernetes controller used to coordinate Calico networking. Runs on one manager node only. |
k8s_calico-node |
The Calico node agent, which coordinates networking fabric according
to the cluster-wide Calico configuration. Part of the calico-node
daemonset. Runs on all nodes. Configure the container network interface
(CNI) plugin using the --cni-installer-url flag. If this flag isn’t
set, UCP uses Calico as the default CNI plugin. |
k8s_install-cni_calico-node |
A container that’s responsible for installing the Calico CNI plugin binaries and configuration on each host. Part of the calico-node daemonset. Runs on all nodes. |
k8s_POD_calico-node |
Pause container for the calico-node pod. |
k8s_POD_calico-kube-controllers |
Pause container for the calico-kube-controllers pod. |
k8s_POD_compose |
Pause container for the compose pod. |
k8s_POD_kube-dns |
Pause container for the kube-dns pod. |
k8s_ucp-dnsmasq-nanny |
A dnsmasq instance used in the Kubernetes DNS Service. Part of the
kube-dns
deployment. Runs on one manager node only. |
k8s_ucp-kube-compose |
A custom Kubernetes resource component that’s responsible for
translating Compose files into Kubernetes constructs. Part of the
compose deployment. Runs on one manager node only. |
k8s_ucp-kube-dns |
The main Kubernetes DNS Service, used by pods to resolve service names.
Part of the kube-dns deployment. Runs on one manager node only.
Provides service discovery for Kubernetes services and pods. A set of
three containers deployed
via Kubernetes as a single pod. |
k8s_ucp-kubedns-sidecar |
Health checking and metrics daemon of the Kubernetes DNS Service. Part
of the kube-dns deployment. Runs on one manager node only. |
ucp-agent |
Monitors the node and ensures the right UCP services are running. |
ucp-auth-api |
The centralized service for identity and authentication used by UCP and DTR. |
ucp-auth-store |
Stores authentication configurations and data for users, organizations, and teams. |
ucp-auth-worker |
Performs scheduled LDAP synchronizations and cleans authentication and authorization data. |
ucp-client-root-ca |
A certificate authority to sign client bundles. |
ucp-cluster-root-ca |
A certificate authority used for TLS communication between UCP components. |
ucp-controller |
The UCP web server. |
ucp-dsinfo |
Docker system information collection script to assist with troubleshooting. |
ucp-interlock |
Monitors swarm workloads configured to use Layer 7 routing. Only runs when you enable Layer 7 routing. |
ucp-interlock-proxy |
A service that provides load balancing and proxying for swarm workloads. Only runs when you enable Layer 7 routing. |
ucp-kube-apiserver |
A master component that serves the Kubernetes API. It persists its state
in etcd directly, and all other components communicate with the API
server directly. The Kubernetes API server is configured to encrypt
Secrets using AES-CBC with a 256-bit key. The encryption key is never
rotated, and the encryption key is stored in a file on disk on manager
nodes. |
ucp-kube-controller-manager |
A master component that manages the desired state of controllers and other Kubernetes objects. It monitors the API server and performs background tasks when needed. |
ucp-kubelet |
The Kubernetes node agent running on every node, which is responsible for running Kubernetes pods, reporting the health of the node, and monitoring resource usage. |
ucp-kube-proxy |
The networking proxy running on every node, which enables pods to contact Kubernetes services and other pods, via cluster IP addresses. |
ucp-kube-scheduler |
A master component that handles scheduling of pods. It communicates with the API server only to obtain workloads that need to be scheduled. |
ucp-kv |
Used to store the UCP configurations. Don’t use it in your applications, since it’s for internal use only. Also used by Kubernetes components. |
ucp-metrics |
Used to collect and process metrics for a node, like the disk space available. |
ucp-proxy |
A TLS proxy. It allows secure access to the local Docker Engine to UCP components. |
ucp-reconcile |
When ucp-agent detects that the node is not running the right UCP
components, it starts the ucp-reconcile container to converge the node
to its desired state. It is expected for the ucp-reconcile container to
remain in an exited state when the node is healthy. |
ucp-swarm-manager |
Used to provide backwards-compatibility with Docker Swarm. |
Applications run on worker nodes. The following table shows the UCP services running on worker nodes.
UCP component | Description |
---|---|
k8s_calico-node |
The Calico node agent, which coordinates networking fabric according to the cluster-wide Calico configuration. Part of the calico-node daemonset. Runs on all nodes. |
k8s_install-cni_calico-node |
A container that’s responsible for installing the Calico CNI plugin
binaries and configuration on each host. Part of the calico-node
daemonset. Runs on all nodes. |
k8s_POD_calico-node |
Pause container for the Calico-node pod. By default, this container
is hidden, but you can see it by running docker ps -a . |
ucp-agent |
Monitors the node and ensures the right UCP services are running |
ucp-interlock-extension |
Helper service that reconfigures the ucp-interlock-proxy service based on the swarm workloads that are running. |
ucp-interlock-proxy |
A service that provides load balancing and proxying for swarm workloads. Only runs when you enable Layer 7 routing. |
ucp-dsinfo |
Docker system information collection script to assist with troubleshooting. |
ucp-kubelet |
The Kubernetes node agent running on every node, which is responsible for running Kubernetes pods, reporting the health of the node, and monitoring resource usage. |
ucp-kube-proxy |
The networking proxy running on every node, which enables pods to contact Kubernetes services and other pods, via cluster IP addresses. |
ucp-reconcile |
When ucp-agent detects that the node is not running the right UCP components, it starts the ucp-reconcile container to converge the node to its desired state. It is expected for the ucp-reconcile container to remain in an exited state when the node is healthy. |
ucp-proxy |
A TLS proxy. It allows secure access to the local Docker Engine to UCP components. |
Every pod in Kubernetes has a pause container, which is an “empty” container that bootstraps the pod to establish all of the namespaces. Pause containers hold the cgroups, reservations, and namespaces of a pod before its individual containers are created. The pause container’s image is always present, so the allocation of the pod’s resources is instantaneous.
By default, pause containers are hidden, but you can see them by running
docker ps -a
.
docker ps -a | grep -i pause
8c9707885bf6 dockereng/ucp-pause:3.0.0-6d332d3 "/pause" 47 hours ago Up 47 hours k8s_POD_calico-kube-controllers-559f6948dc-5c84l_kube-system_d00e5130-1bf4-11e8-b426-0242ac110011_0
258da23abbf5 dockereng/ucp-pause:3.0.0-6d332d3 "/pause" 47 hours ago Up 47 hours k8s_POD_kube-dns-6d46d84946-tqpzr_kube-system_d63acec6-1bf4-11e8-b426-0242ac110011_0
2e27b5d31a06 dockereng/ucp-pause:3.0.0-6d332d3 "/pause" 47 hours ago Up 47 hours k8s_POD_compose-698cf787f9-dxs29_kube-system_d5866b3c-1bf4-11e8-b426-0242ac110011_0
5d96dff73458 dockereng/ucp-pause:3.0.0-6d332d3 "/pause" 47 hours ago Up 47 hours k8s_POD_calico-node-4fjgv_kube-system_d043a0ea-1bf4-11e8-b426-0242ac110011_0
UCP uses the following named volumes to persist data in all nodes where it runs.
Volume name | Description |
---|---|
ucp-auth-api-certs |
Certificate and keys for the authentication and authorization service |
ucp-auth-store-certs |
Certificate and keys for the authentication and authorization store |
ucp-auth-store-data |
Data of the authentication and authorization store, replicated across managers |
ucp-auth-worker-certs |
Certificate and keys for authentication worker |
ucp-auth-worker-data |
Data of the authentication worker |
ucp-client-root-ca |
Root key material for the UCP root CA that issues client certificates |
ucp-cluster-root-ca |
Root key material for the UCP root CA that issues certificates for swarm members |
ucp-controller-client-certs |
Certificate and keys used by the UCP web server to communicate with other UCP components |
ucp-controller-server-certs |
Certificate and keys for the UCP web server running in the node |
ucp-kv |
UCP configuration data, replicated across managers |
ucp-kv-certs |
Certificates and keys for the key-value store |
ucp-metrics-data |
Monitoring data gathered by UCP |
ucp-metrics-inventory |
Configuration file used by the ucp-metrics service |
ucp-node-certs |
Certificate and keys for node communication |
You can customize the volume driver used for these volumes by creating the volumes before installing UCP. During the installation, UCP checks which volumes don’t exist in the node, and creates them using the default volume driver.
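As a sketch of that pre-creation step, using ucp-kv from the table above (the local driver is only a placeholder for the volume driver you actually want to use):
$ docker volume create --driver local ucp-kv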
By default, the data for these volumes can be found at
/var/lib/docker/volumes/<volume-name>/_data
.
The following table shows the configurations used by UCP.
Configuration name | Description |
---|---|
com.docker.interlock.extension |
Configuration for the Interlock extension service that monitors and configures the proxy service |
com.docker.interlock.proxy |
Configuration for the service responsible for handling user requests and routing them |
com.docker.license |
Docker Enterprise license |
com.docker.ucp.interlock.conf |
Configuration for the core Interlock service |
There are two ways to interact with UCP: the web UI or the CLI.
You can use the UCP web UI to manage your swarm, grant and revoke user permissions, deploy, configure, manage, and monitor your applications.
UCP also exposes the standard Docker API, so you can continue using existing tools like the Docker CLI client. Since UCP secures your cluster with RBAC, you need to configure your Docker CLI client and other client tools to authenticate your requests using client certificates that you can download from your UCP profile page.
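For example, after downloading and extracting a client bundle from your UCP profile page, you can load it into a shell session as follows (a sketch; the directory name is hypothetical):
$ cd ucp-bundle-admin
$ eval "$(<env.sh)"
$ docker ps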
Universal Control Plane can be installed on-premises or on the cloud. Before installing, be sure your infrastructure has these requirements.
You can install UCP on-premises or on a cloud provider. Common requirements:
/var partition for manager nodes (A minimum of 6GB is recommended.)
/var partition for worker nodes
Note
Increased storage is required for Kubernetes manager nodes in UCP 3.1.
Note that Windows container images are typically larger than Linux container images. For this reason, you should provision more local storage for Windows nodes and for any DTR setups that store Windows container images.
Also, make sure the nodes are running an operating system supported by Docker Enterprise.
For highly-available installations, you also need a way to transfer files between hosts.
Note
Workloads on manager nodes
Docker does not support workloads other than those required for UCP on UCP manager nodes.
When installing UCP on a host, a series of ports need to be opened to incoming traffic. Each of these ports will expect incoming traffic from a set of hosts, indicated as the “Scope” of that port. The three scopes are:
- External: Traffic arrives from outside the cluster through end-user interaction.
- Internal: Traffic arrives from other hosts in the same cluster.
- Self: Traffic arrives to that port only from processes on the same host.
Note
When installing UCP on Microsoft Azure, an overlay network is not used for Kubernetes; therefore, any containerized service deployed onto Kubernetes and exposed as a Kubernetes Service may need its corresponding port to be opened on the underlying Azure Network Security Group.
Make sure the following ports are open for incoming traffic on the respective host types:
Hosts | Port | Scope | Purpose |
---|---|---|---|
managers, workers | TCP 179 | Internal | Port for BGP peers, used for Kubernetes networking |
managers | TCP 443 (configurable) | External, Internal | Port for the UCP web UI and API |
managers | TCP 2376 (configurable) | Internal | Port for the Docker Swarm manager. Used for backwards compatibility |
managers | TCP 2377 (configurable) | Internal | Port for control communication between swarm nodes |
managers, workers | UDP 4789 | Internal | Port for overlay networking |
managers | TCP 6443 (configurable) | External, Internal | Port for Kubernetes API server endpoint |
managers, workers | TCP 6444 | Self | Port for Kubernetes API reverse proxy |
managers, workers | TCP, UDP 7946 | Internal | Port for gossip-based clustering |
managers, workers | TCP 9099 | Self | Port for calico health check |
managers, workers | TCP 10250 | Internal | Port for Kubelet |
managers, workers | TCP 12376 | Internal | Port for a TLS authentication proxy that provides access to the Docker Engine |
managers, workers | TCP 12378 | Self | Port for Etcd reverse proxy |
managers | TCP 12379 | Internal | Port for Etcd Control API |
managers | TCP 12380 | Internal | Port for Etcd Peer API |
managers | TCP 12381 | Internal | Port for the UCP cluster certificate authority |
managers | TCP 12382 | Internal | Port for the UCP client certificate authority |
managers | TCP 12383 | Internal | Port for the authentication storage backend |
managers | TCP 12384 | Internal | Port for the authentication storage backend for replication across managers |
managers | TCP 12385 | Internal | Port for the authentication service API |
managers | TCP 12386 | Internal | Port for the authentication worker |
managers | TCP 12388 | Internal | Port for the Kubernetes API server |
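As an illustration only, opening a few of the manager ports with firewalld might look like the following (firewalld is an assumption; use whatever firewall tooling your distribution provides, and open the full set of ports listed in the table):
$ sudo firewall-cmd --permanent --add-port=443/tcp --add-port=2377/tcp --add-port=6443/tcp
$ sudo firewall-cmd --permanent --add-port=4789/udp --add-port=7946/tcp --add-port=7946/udp
$ sudo firewall-cmd --reload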
CLOUD_NETCONFIG_MANAGE for SLES 15
For SUSE Linux Enterprise Server 15 (SLES 15) installations, you must disable CLOUD_NETCONFIG_MANAGE prior to installing UCP.
1. In the network interface configuration file, `/etc/sysconfig/network/ifcfg-eth0`, set
```
CLOUD_NETCONFIG_MANAGE="no"
```
2. Run `service network restart`.
For overlay networks with encryption to work, you need to ensure that IP protocol 50 (Encapsulating Security Payload) traffic is allowed.
The default networking plugin for UCP is Calico, which uses IP Protocol Number 4 for IP-in-IP encapsulation.
If you’re deploying to AWS or another cloud provider, enable IP-in-IP traffic for your cloud provider’s security group.
Calico’s Kubernetes controllers can’t reach the Kubernetes API server unless connection tracking is enabled on the loopback interface. SLES disables connection tracking by default.
On each node in the cluster:
sudo mkdir -p /etc/sysconfig/SuSEfirewall2.d/defaults
echo FW_LO_NOTRACK=no | sudo tee /etc/sysconfig/SuSEfirewall2.d/defaults/99-docker.cfg
sudo SuSEfirewall2 start
Make sure the networks you’re using allow the UCP components enough time to communicate before they time out.
Component | Timeout (ms) | Configurable |
---|---|---|
Raft consensus between manager nodes | 3000 | no |
Gossip protocol for overlay networking | 5000 | no |
etcd | 500 | yes |
RethinkDB | 10000 | no |
Stand-alone cluster | 90000 | no |
In distributed systems like UCP, time synchronization is critical to ensure proper operation. As a best practice to ensure consistency between the engines in a UCP cluster, all engines should regularly synchronize time with a Network Time Protocol (NTP) server. If a server’s clock is skewed, unexpected behavior may cause poor performance or even failures.
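On systemd-based hosts (an assumption; other init systems have their own tooling), a quick way to confirm that the clock is being synchronized is:
$ timedatectl status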
Docker Enterprise is a software subscription that includes three products:
UCP 3.1.8 requires minimum versions of the following Docker components:
Universal Control Plane (UCP) helps you manage your container cluster from a centralized place. This article explains what you need to consider before deploying UCP for production.
Before installing UCP, make sure that all nodes (physical or virtual machines) that you’ll manage with UCP:
UCP requires Docker Enterprise. Before installing Docker Enterprise on your cluster nodes, you should plan for a common hostname strategy.
Decide if you want to use short hostnames, like engine01
, or Fully
Qualified Domain Names (FQDN), like node01.company.example.com
.
Whichever you choose, confirm your naming strategy is consistent across
the cluster, because Docker Engine and UCP use hostnames.
For example, if your cluster has three hosts, you can name them:
node1.company.example.com
node2.company.example.com
node3.company.example.com
UCP requires each node on the cluster to have a static IPv4 address. Before installing UCP, ensure your network and nodes are configured to support this.
The following table lists recommendations to avoid IP range conflicts.
Component | Subnet | Range | Default IP address |
---|---|---|---|
Engine | default-address-pools |
CIDR range for interface and bridge networks | 172.17.0.0/16 - 172.30.0.0/16, 192.168.0.0/16 |
Swarm | default-addr-pool |
CIDR range for Swarm overlay networks | 10.0.0.0/8 |
Kubernetes | pod-cidr |
CIDR range for Kubernetes pods | 192.168.0.0/16 |
Kubernetes | service-cluster-ip-range |
CIDR range for Kubernetes services | 10.96.0.0/16 |
Two IP ranges are used by the engine for the docker0 and docker_gwbridge interfaces.
default-address-pools
defines a pool of CIDR ranges that are used to
allocate subnets for local bridge networks. By default the first available
subnet (172.17.0.0/16
) is assigned to docker0
and the next available
subnet (172.18.0.0/16
) is assigned to docker_gwbridge
. Both the
docker0
and docker_gwbridge
subnet can be modified by changing the
default-address-pools
value or as described in their individual sections
below.
The default value for default-address-pools
is:
{
"default-address-pools": [
{"base":"172.17.0.0/16","size":16}, <-- docker0
{"base":"172.18.0.0/16","size":16}, <-- docker_gwbridge
{"base":"172.19.0.0/16","size":16},
{"base":"172.20.0.0/16","size":16},
{"base":"172.21.0.0/16","size":16},
{"base":"172.22.0.0/16","size":16},
{"base":"172.23.0.0/16","size":16},
{"base":"172.24.0.0/16","size":16},
{"base":"172.25.0.0/16","size":16},
{"base":"172.26.0.0/16","size":16},
{"base":"172.27.0.0/16","size":16},
{"base":"172.28.0.0/16","size":16},
{"base":"172.29.0.0/16","size":16},
{"base":"172.30.0.0/16","size":16},
{"base":"192.168.0.0/16","size":20}
]
}
default-address-pools: A list of IP address pools for local bridge networks. Each entry in the list contains the following:
- base: CIDR range to be allocated for bridge networks.
- size: CIDR netmask that determines the subnet size to allocate from the base pool.
To offer an example, {"base":"192.168.0.0/16","size":20} will allocate /20 subnets from 192.168.0.0/16, yielding the following subnets for bridge networks:
- 192.168.0.0/20 (192.168.0.0 - 192.168.15.255)
- 192.168.16.0/20 (192.168.16.0 - 192.168.31.255)
- 192.168.32.0/20 (192.168.32.0 - 192.168.47.255)
- 192.168.48.0/20 (192.168.48.0 - 192.168.63.255)
- 192.168.64.0/20 (192.168.64.0 - 192.168.79.255)
- …
- 192.168.240.0/20 (192.168.240.0 - 192.168.255.255)
Note
If the size matches the netmask of the base, then that pool only contains one subnet.
For example, {"base":"172.17.0.0/16","size":16}
will only yield one
subnet 172.17.0.0/16
(172.17.0.0
- 172.17.255.255
).
By default, the Docker engine creates and configures the host system with a
virtual network interface called docker0
, which is an ethernet bridge
device. If you don’t specify a different network when starting a container, the
container is connected to the bridge and all traffic coming from and going to
the container flows over the bridge to the Docker engine, which handles routing
on behalf of the container.
Docker engine creates docker0
with a configurable IP range. Containers
which are connected to the default bridge are allocated IP addresses within
this range. Certain default settings apply to docker0
unless you specify
otherwise. The default subnet for docker0
is the first pool in
default-address-pools
which is 172.17.0.0/16
.
The recommended way to configure the docker0 settings is to use the daemon.json file.
If only the subnet needs to be customized, it can be changed by modifying the
first pool of default-address-pools
in the daemon.json
file.
{
"default-address-pools": [
{"base":"172.17.0.0/16","size":16}, <-- Modify this value
{"base":"172.18.0.0/16","size":16},
{"base":"172.19.0.0/16","size":16},
{"base":"172.20.0.0/16","size":16},
{"base":"172.21.0.0/16","size":16},
{"base":"172.22.0.0/16","size":16},
{"base":"172.23.0.0/16","size":16},
{"base":"172.24.0.0/16","size":16},
{"base":"172.25.0.0/16","size":16},
{"base":"172.26.0.0/16","size":16},
{"base":"172.27.0.0/16","size":16},
{"base":"172.28.0.0/16","size":16},
{"base":"172.29.0.0/16","size":16},
{"base":"172.30.0.0/16","size":16},
{"base":"192.168.0.0/16","size":20}
]
}
Note
Modifying this value can also affect the docker_gwbridge
if the size
doesn’t match the netmask of the base
.
To configure a CIDR range and not rely on default-address-pools, the fixed-cidr setting can be used:
{
  "fixed-cidr": "172.17.0.0/16"
}
fixed-cidr: Specify the subnet for docker0, using standard CIDR notation. The default is 172.17.0.0/16; the network gateway will be 172.17.0.1 and IPs for your containers will be allocated from the range 172.17.0.2 - 172.17.255.254.
To configure a gateway IP and CIDR range while not relying on default-address-pools, the bip setting can be used:
{
  "bip": "172.17.0.1/16"
}
bip: Specify a gateway IP address and CIDR netmask of the docker0 network. The notation is <gateway IP>/<CIDR netmask> and the default is 172.17.0.1/16, which will make the docker0 network gateway 172.17.0.1 and subnet 172.17.0.0/16.
The docker_gwbridge
is a virtual network interface that connects the
overlay networks (including the ingress
network) to an individual Docker
engine’s physical network. Docker creates it automatically when you initialize
a swarm or join a Docker host to a swarm, but it is not a Docker device. It
exists in the kernel of the Docker host. The default subnet for
docker_gwbridge
is the next available subnet in default-address-pools, which with the default settings is 172.18.0.0/16.
Note
If you need to customize the docker_gwbridge
settings, you must
do so before joining the host to the swarm, or after temporarily
removing the host from the swarm.
The recommended way to configure the docker_gwbridge
settings is to
use the daemon.json
file.
For docker_gwbridge, the second available subnet will be allocated from default-address-pools. If any customizations were made to the docker0 interface, it could affect which subnet is allocated. With the default default-address-pools settings, you would modify the second pool.
{
"default-address-pools": [
{"base":"172.17.0.0/16","size":16},
{"base":"172.18.0.0/16","size":16}, <-- Modify this value
{"base":"172.19.0.0/16","size":16},
{"base":"172.20.0.0/16","size":16},
{"base":"172.21.0.0/16","size":16},
{"base":"172.22.0.0/16","size":16},
{"base":"172.23.0.0/16","size":16},
{"base":"172.24.0.0/16","size":16},
{"base":"172.25.0.0/16","size":16},
{"base":"172.26.0.0/16","size":16},
{"base":"172.27.0.0/16","size":16},
{"base":"172.28.0.0/16","size":16},
{"base":"172.29.0.0/16","size":16},
{"base":"172.30.0.0/16","size":16},
{"base":"192.168.0.0/16","size":20}
]
}
Swarm uses a default address pool of 10.0.0.0/8
for its overlay
networks. If this conflicts with your current network implementation,
please use a custom IP address pool. To specify a custom IP address
pool, use the --default-addr-pool
command line option during Swarm
initialization.
Note
The Swarm default-addr-pool
setting is separate from the Docker
engine default-address-pools
setting. They are two separate
ranges that are used for different purposes.
Note
Currently, the UCP installation process does not support this flag. To deploy with a custom IP pool, Swarm must first be initialized using this flag and UCP must be installed on top of it.
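For illustration, a minimal sketch of initializing Swarm with a custom pool before installing UCP on top of it; the pool and subnet size below are example values only, chosen to avoid a hypothetical conflict with 10.0.0.0/8:
# Initialize Swarm with a custom overlay address pool (illustrative values).
docker swarm init --default-addr-pool 10.20.0.0/16 --default-addr-pool-mask-length 26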
There are two internal IP ranges used within Kubernetes that may overlap and conflict with the underlying infrastructure:
Pod network: Kubernetes pods are allocated IP addresses from the 192.168.0.0/16 range. This can be customized at install time by passing the --pod-cidr flag to the UCP install command.
Service network: Kubernetes services are allocated virtual IP addresses from the 10.96.0.0/16 range. Beginning with 3.1.8, this value can be changed at install time with the --service-cluster-ip-range flag.
For SUSE Linux Enterprise Server 12 SP2 (SLES12), the FW_LO_NOTRACK
flag is turned on by default in the openSUSE firewall. This speeds up
packet processing on the loopback interface, and breaks certain firewall
setups that need to redirect outgoing packets via custom rules on the
local machine.
To turn off the FW_LO_NOTRACK option, edit the
/etc/sysconfig/SuSEfirewall2
file and set FW_LO_NOTRACK="no"
.
Save the file and restart the firewall or reboot.
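As a sketch, assuming the SuSEfirewall2 service name used on SLES12, the change can also be made non-interactively:
# Set FW_LO_NOTRACK to "no" and restart the firewall; a reboot also works.
sudo sed -i 's/^FW_LO_NOTRACK=.*/FW_LO_NOTRACK="no"/' /etc/sysconfig/SuSEfirewall2
sudo systemctl restart SuSEfirewall2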
For SUSE Linux Enterprise Server 12 SP3, the default value for
FW_LO_NOTRACK
was changed to no
.
For Red Hat Enterprise Linux (RHEL) 8, if firewalld is running and
FirewallBackend=nftables
is set in
/etc/firewalld/firewalld.conf
, change this to
FirewallBackend=iptables
, or you can explicitly run the following
commands to allow traffic to enter the default bridge (docker0) network:
firewall-cmd --permanent --zone=trusted --add-interface=docker0
firewall-cmd --reload
In distributed systems like UCP, time synchronization is critical to ensure proper operation. As a best practice to ensure consistency between the engines in a UCP cluster, all engines should regularly synchronize time with a Network Time Protocol (NTP) server. If a host node’s clock is skewed, unexpected behavior may cause poor performance or even failures.
UCP doesn’t include a load balancer. You can configure your own load balancer to balance user requests across all manager nodes.
If you plan to use a load balancer, you need to decide whether you’ll add the nodes to the load balancer using their IP addresses or their FQDNs. Whichever you choose, be consistent across nodes. When this is decided, take note of all IPs or FQDNs before starting the installation.
By default, UCP and DTR both use port 443. If you plan on deploying UCP and DTR, your load balancer needs to distinguish traffic between the two by IP address or port number.
If you want to install UCP in a high-availability configuration that uses a
load balancer in front of your UCP controllers, include the appropriate IP
address and FQDN of the load balancer’s VIP by using one or more --san
flags in the UCP install command or when you’re asked
for additional SANs in interactive mode.
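For example, a sketch of an install command that adds the load balancer’s FQDN and VIP as SANs; lb.example.com and the bracketed values are placeholders:
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp:3.2.5 install \
  --host-address <node-ip-address> \
  --san lb.example.com \
  --san <load-balancer-vip> \
  --interactive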
You can customize UCP to use certificates signed by an external Certificate Authority. When using your own certificates, you need to have a certificate bundle that has:
You can have a certificate for each manager, with a common SAN. For example, on a three-node cluster, you can have:
You can also install UCP with a single externally-signed certificate for all managers, rather than one for each manager node. In this case, the certificate files are copied automatically to any new manager nodes joining the cluster or being promoted to a manager role.
Universal Control Plane (UCP) is a containerized application that you can install on-premise or on a cloud infrastructure.
The first step to installing UCP is ensuring that your infrastructure has all of the requirements UCP needs to run. Also, you need to ensure that all nodes, physical and virtual, are running the same version of Docker Enterprise.
Important
If you are installing UCP on a public cloud platform, refer to the cloud-specific UCP installation documentation.
UCP is a containerized application that requires the commercially supported Docker Engine to run.
Install Docker Enterprise on each host that you plan to manage with UCP. View the supported platforms and click on your platform to get platform-specific instructions for installing Docker Enterprise.
Make sure you install the same Docker Enterprise version on all the
nodes. Also, if you’re creating virtual machine templates with Docker
Enterprise already installed, make sure the /etc/docker/key.json
file is not included in the virtual machine image. When provisioning the
virtual machine, restart the Docker daemon to generate a new
/etc/docker/key.json
file.
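A minimal sketch of the clean-up when building the template image and the restart on a provisioned machine, assuming a systemd-based host:
# On the template image, before capturing it:
sudo rm -f /etc/docker/key.json
# On each provisioned VM, restart the daemon so a new key.json is generated:
sudo systemctl restart docker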
Skip this step if you want to use the defaults provided by UCP.
UCP uses named volumes to persist data. If you want to customize the drivers used to manage these volumes, you can create the volumes before installing UCP. When you install UCP, the installer will notice that the volumes already exist, and it will start using them.
If these volumes don’t exist, they’ll be automatically created when installing UCP.
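For illustration, a volume can be pre-created with a custom driver before running the installer. ucp-node-certs is one of the UCP named volumes; the driver name and options below are placeholders, and the full list of volumes is in the UCP system requirements:
# Pre-create a UCP named volume with a custom driver (placeholder values).
docker volume create --driver <driver-name> --opt <key>=<value> ucp-node-certs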
To install UCP, you use the docker/ucp
image, which has commands to
install and manage UCP.
Make sure you follow the UCP System requirements for opening networking ports. Ensure that your hardware or software firewalls are open appropriately or disabled.
Use ssh to log in to the host where you want to install UCP.
Run the following command:
# Pull the latest version of UCP
docker image pull docker/ucp:3.2.5
# Install UCP
docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 install \
--host-address <node-ip-address> \
--interactive
This runs the install command in interactive mode, so that you’re prompted for any necessary configuration values. To find what other options are available in the install command, including how to install UCP on a system with SELinux enabled, check the reference documentation.
Important
UCP will install Project Calico for container-to-container communication for Kubernetes. A platform operator may choose to install an alternative CNI plugin, such as Weave or Flannel. Please see Install an unmanaged CNI plugin for more information.
Now that UCP is installed, you need to license it. To use UCP, you are required to have a Docker Enterprise subscription, or you can test the platform with a free trial license.
Go to Docker Hub to get a free trial license.
In your browser, navigate to the UCP web UI, log in with your administrator credentials and upload your license. Navigate to the Admin Settings page and in the left pane, click License.
Click Upload License and navigate to your license (.lic) file. When you’re finished selecting the license, UCP updates with the new settings.
To make your Docker swarm and UCP fault-tolerant and highly available, you can join more manager nodes to it. Manager nodes are the nodes in the swarm that perform the orchestration and swarm management tasks, and dispatch tasks for worker nodes to execute.
To join manager nodes to the swarm,
In the UCP web UI, navigate to the Nodes page, and click the Add Node button to add a new node.
In the Add Node page, check Add node as a manager to turn this node into a manager and replicate UCP for high-availability.
If you want to customize the network and port where the new node
listens for swarm management traffic, click Use a custom listen
address. Enter the IP address and port for the node to listen for
inbound cluster management traffic. The format is interface:port
or ip:port
. The default is 0.0.0.0:2377
.
If you want to customize the network and port that the new node
advertises to other swarm members for API access, click Use a
custom advertise address and enter the IP address and port. By
default, this is also the outbound address used by the new node to
contact UCP. The joining node should be able to contact itself at
this address. The format is interface:port
or ip:port
.
Click the copy icon to copy the docker swarm join
command that
nodes use to join the swarm.
For each manager node that you want to join to the swarm, log in using ssh and run the join command that you copied. After the join command completes, the node appears on the Nodes page in the UCP web UI.
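The copied command has the following general form; the token and manager address are specific to your cluster:
docker swarm join --token SWMTKN-1-<token> <manager-ip>:2377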
Note
Skip the joining of worker nodes if you don’t want to add more nodes to run and scale your apps.
To add more computational resources to your swarm, you can join worker nodes. These nodes execute tasks assigned to them by the manager nodes. Follow the same steps as before, but don’t check the Add node as a manager option.
The procedure to install Universal Control Plane on a host is the same, whether the host has access to the internet or not.
The only difference when installing on an offline host is that instead of pulling the UCP images from Docker Hub, you use a computer that’s connected to the internet to download a single package with all the images. Then you copy this package to the host where you install UCP. The offline installation process works only if one of the following is true:
If the managers have access to Docker Hub while the workers don’t, installation will fail.
Use a computer with internet access to download the UCP package from the following links.
You can also use these links to get the UCP package from the command line:
$ wget <ucp-package-url> -O ucp.tar.gz
Now that you have the package in your local machine, you can transfer it to the machines where you want to install UCP.
For each machine that you want to manage with UCP:
Copy the UCP package to the machine.
$ scp ucp.tar.gz <user>@<host>
Use ssh to log in to the hosts where you transferred the package.
Load the UCP images.
Once the package is transferred to the hosts, you can use the
docker load
command, to load the Docker images from the tar
archive:
$ docker load -i ucp.tar.gz
Follow the same steps for the DTR binaries.
Now that the offline hosts have all the images needed to install UCP, you can install UCP on one of the manager nodes.
Universal Control Plane (UCP) can be installed on AWS without any customization by following the UCP install documentation, so this document is optional. However, if you are deploying Kubernetes workloads with UCP and want to leverage the AWS Kubernetes cloud provider, which provides dynamic volume and load balancer provisioning, then you should follow this guide. This guide is not required if you are only deploying Swarm workloads.
The requirements for installing UCP on AWS are included in the following sections:
The instance’s host name must be named
ip-<private ip>.<region>.compute.internal
. For example:
ip-172-31-15-241.us-east-2.compute.internal
The instance must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources can be shared between multiple clusters, they should be tagged as shared. For example:
kubernetes.io/cluster/1729543642a6
owned
Manager nodes must have an instance profile with appropriate policies attached to enable introspection and provisioning of resources. The following example is very permissive:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [ "ec2:*" ],
"Resource": [ "*" ]
},
{
"Effect": "Allow",
"Action": [ "elasticloadbalancing:*" ],
"Resource": [ "*" ]
},
{
"Effect": "Allow",
"Action": [ "route53:*" ],
"Resource": [ "*" ]
},
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [ "arn:aws:s3:::kubernetes-*" ]
}
]
}
Worker nodes must have an instance profile with appropriate policies attached to enable access to dynamically provisioned resources. The following example is very permissive:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [ "arn:aws:s3:::kubernetes-*" ]
},
{
"Effect": "Allow",
"Action": "ec2:Describe*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:AttachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:DetachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [ "route53:*" ],
"Resource": [ "*" ]
}
]
}
The VPC must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources can be shared between multiple clusters, they should be tagged as shared. For example:
kubernetes.io/cluster/1729543642a6
owned
Subnets must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources may be shared between multiple clusters, they should be tagged as shared. For example:
kubernetes.io/cluster/1729543642a6
owned
Once all prerequisites have been met, run the following command to install UCP on a manager node. The --host-address flag maps to the private IP address of the master node.
$ docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 install \
--host-address <ucp-ip> \
--cloud-provider aws \
--interactive
Universal Control Plane (UCP) closely integrates with Microsoft Azure for its Kubernetes Networking and Persistent Storage feature set. UCP deploys the Calico CNI provider. In Azure, the Calico CNI leverages the Azure networking infrastructure for data path networking and the Azure IPAM for IP address management. There are infrastructure prerequisites required prior to UCP installation for the Calico / Azure integration.
UCP configures the Azure IPAM module for Kubernetes to allocate IP addresses for Kubernetes pods. The Azure IPAM module requires each Azure VM which is part of the Kubernetes cluster to be configured with a pool of IP addresses.
There are two options for provisioning IPs for the Kubernetes cluster on Azure:
An automated mechanism triggered at UCP installation time, in which UCP deploys the calico-node daemonset and provisions 128 IP addresses for each node by default.
Manual provisioning of additional IP addresses for each Azure VM, for example through the Azure Portal, the Azure CLI ($ az network nic ip-config create), or an ARM template.
You must meet the following infrastructure prerequisites to successfully deploy UCP on Azure. Failure to meet these prerequisites may result in significant errors during the installation process.
A Service Principal with Contributor access to the Azure Resource Group hosting the UCP nodes. This Service Principal will be used by Kubernetes to communicate with the Azure API. The Service Principal ID and Secret Key are needed as part of the UCP prerequisites. If you are using a separate Resource Group for the networking components, the same Service Principal will need Network Contributor access to this Resource Group.
UCP requires the following information for the installation:
subscriptionId - The Azure Subscription ID in which the UCP objects are being deployed.
tenantId - The Azure Active Directory Tenant ID in which the UCP objects are being deployed.
aadClientId - The Azure Service Principal ID.
aadClientSecret - The Azure Service Principal Secret Key.
For UCP to integrate with Microsoft Azure, all Linux UCP Manager and
Linux UCP Worker nodes in your cluster need an identical Azure
configuration file, azure.json
. Place this file within
/etc/kubernetes
on each host. Since the configuration file is owned
by root
, set its permissions to 0644
to ensure the container
user has read access.
The following is an example template for azure.json
. Replace ***
with real values, and leave the other parameters as is.
{
"cloud":"AzurePublicCloud",
"tenantId": "***",
"subscriptionId": "***",
"aadClientId": "***",
"aadClientSecret": "***",
"resourceGroup": "***",
"location": "***",
"subnetName": "***",
"securityGroupName": "***",
"vnetName": "***",
"useInstanceMetadata": true
}
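A short sketch of distributing the populated file to a node with the ownership and permissions described above:
# Copy azure.json to each Linux manager and worker node, then set permissions.
sudo mkdir -p /etc/kubernetes
sudo cp azure.json /etc/kubernetes/azure.json
sudo chown root:root /etc/kubernetes/azure.json
sudo chmod 0644 /etc/kubernetes/azure.json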
There are some optional parameters for Azure deployments:
primaryAvailabilitySetName - The Worker Nodes availability set.
vnetResourceGroup - The Virtual Network Resource group, if your Azure Network objects live in a separate resource group.
routeTableName - If you have defined multiple Route tables within an Azure subnet.
Warning
You must follow these guidelines and either use the appropriate size network in Azure or take the proper action to fit within the subnet. Failure to follow these guidelines may cause significant issues during the installation process.
The subnet and the virtual network associated with the primary interface of the Azure VMs needs to be configured with a large enough address prefix/range. The number of required IP addresses depends on the workload and the number of nodes in the cluster.
For example, in a cluster of 256 nodes, make sure that the address space of the subnet and the virtual network can allocate at least 128 * 256 IP addresses, in order to run a maximum of 128 pods concurrently on a node. This would be in addition to the initial IP allocations to the VM network interface cards (NICs) during Azure resource creation.
Accounting for IP addresses that are allocated to NICs during VM
bring-up, set the address space of the subnet and virtual network to
10.0.0.0/16
. This ensures that the network can dynamically allocate
at least 32768 addresses, plus a buffer for initial allocations for
primary IP addresses.
Note
The Azure IPAM module queries an Azure VM’s metadata to obtain a list
of IP addresses which are assigned to the VM’s NICs. The IPAM module
allocates these IP addresses to Kubernetes pods. You configure the IP
addresses as ipConfigurations
in the NICs associated with a VM or
scale set member, so that Azure IPAM can provide them to Kubernetes
when requested.
Configure IP Pools for each member of the VM scale set during
provisioning by associating multiple ipConfigurations
with the scale
set’s networkInterfaceConfigurations
. The following is an example
networkProfile
configuration for an ARM template that configures
pools of 32 IP addresses for each VM in the VM scale set.
"networkProfile": {
"networkInterfaceConfigurations": [
{
"name": "[variables('nicName')]",
"properties": {
"ipConfigurations": [
{
"name": "[variables('ipConfigName1')]",
"properties": {
"primary": "true",
"subnet": {
"id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
},
"loadBalancerBackendAddressPools": [
{
"id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/backendAddressPools/', variables('bePoolName'))]"
}
],
"loadBalancerInboundNatPools": [
{
"id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/inboundNatPools/', variables('natPoolName'))]"
}
]
}
},
{
"name": "[variables('ipConfigName2')]",
"properties": {
"subnet": {
"id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
}
}
}
.
.
.
{
"name": "[variables('ipConfigName32')]",
"properties": {
"subnet": {
"id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
}
}
}
],
"primary": "true"
}
}
]
}
During a UCP installation, a user can alter the number of Azure IP addresses UCP will automatically provision for pods. By default, UCP will provision 128 addresses, from the same Azure Subnet as the hosts, for each VM in the cluster. However, if you have manually attached additional IP addresses to the VMs (via an ARM Template, Azure CLI or Azure Portal) or you are deploying into a small Azure subnet (less than /16), the --azure-ip-count flag can be used at install time.
Note
Do not set the --azure-ip-count
variable to a value of less than
6 if you have not manually provisioned additional IP addresses for
each VM. The UCP installation will need at least 6 IP addresses to
allocate to the core UCP components that run as Kubernetes pods. This
is in addition to the VM’s private IP address.
Below are some example scenarios which require the --azure-ip-count
variable to be defined.
Scenario 1 - Manually Provisioned Addresses
If you have manually provisioned additional IP addresses for each VM,
and want to disable UCP from dynamically provisioning more IP addresses
for you, then you would pass --azure-ip-count 0
into the UCP
installation command.
Scenario 2 - Reducing the number of Provisioned Addresses
If you want to reduce the number of IP addresses dynamically allocated from 128 addresses to a custom value due to:
For example, if you wanted to provision 16 addresses per VM, then you would pass --azure-ip-count 16 into the UCP installation command.
If you need to adjust this value post-installation, refer to the instructions on how to download the UCP configuration file, change the value, and update the configuration via the API. If you reduce the value post-installation, existing VMs will not be reconciled, and you will have to manually edit the IP count in Azure.
Run the following command to install UCP on a manager node. The
--pod-cidr
option maps to the IP address range that you have
configured for the Azure subnet, and the --host-address
maps to the
private IP address of the master node. Finally, if you want to adjust the number of IP addresses provisioned to each VM, pass --azure-ip-count.
Note
The pod-cidr range must match the Azure Virtual Network’s Subnet attached to the hosts. For example, if the Azure Virtual Network had the range 172.0.0.0/16 with VMs provisioned on an Azure Subnet of 172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.
docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 install \
--host-address <ucp-ip> \
--pod-cidr <ip-address-range> \
--cloud-provider Azure \
--interactive
This document describes how to create Azure custom roles to deploy Docker Enterprise resources.
A resource group is a container that holds resources for an Azure solution. These resources are the virtual machines (VMs), networks, and storage accounts associated with the swarm.
To create a custom, all-in-one role with permissions to deploy a Docker Enterprise cluster into a single resource group:
Create the role permissions JSON file.
{
"Name": "Docker Platform All-in-One",
"IsCustom": true,
"Description": "Can install and manage Docker platform.",
"Actions": [
"Microsoft.Authorization/*/read",
"Microsoft.Authorization/roleAssignments/write",
"Microsoft.Compute/availabilitySets/read",
"Microsoft.Compute/availabilitySets/write",
"Microsoft.Compute/disks/read",
"Microsoft.Compute/disks/write",
"Microsoft.Compute/virtualMachines/extensions/read",
"Microsoft.Compute/virtualMachines/extensions/write",
"Microsoft.Compute/virtualMachines/read",
"Microsoft.Compute/virtualMachines/write",
"Microsoft.Network/loadBalancers/read",
"Microsoft.Network/loadBalancers/write",
"Microsoft.Network/loadBalancers/backendAddressPools/join/action",
"Microsoft.Network/networkInterfaces/read",
"Microsoft.Network/networkInterfaces/write",
"Microsoft.Network/networkInterfaces/join/action",
"Microsoft.Network/networkSecurityGroups/read",
"Microsoft.Network/networkSecurityGroups/write",
"Microsoft.Network/networkSecurityGroups/join/action",
"Microsoft.Network/networkSecurityGroups/securityRules/read",
"Microsoft.Network/networkSecurityGroups/securityRules/write",
"Microsoft.Network/publicIPAddresses/read",
"Microsoft.Network/publicIPAddresses/write",
"Microsoft.Network/publicIPAddresses/join/action",
"Microsoft.Network/virtualNetworks/read",
"Microsoft.Network/virtualNetworks/write",
"Microsoft.Network/virtualNetworks/subnets/read",
"Microsoft.Network/virtualNetworks/subnets/write",
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Resources/subscriptions/resourcegroups/read",
"Microsoft.Resources/subscriptions/resourcegroups/write",
"Microsoft.Security/advancedThreatProtectionSettings/read",
"Microsoft.Security/advancedThreatProtectionSettings/write",
"Microsoft.Storage/*/read",
"Microsoft.Storage/storageAccounts/listKeys/action",
"Microsoft.Storage/storageAccounts/write"
],
"NotActions": [],
"AssignableScopes": [
"/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
]
}
Create the Azure RBAC role.
az role definition create --role-definition all-in-one-role.json
Compute resources act as servers for running containers.
To create a custom role to deploy Docker Enterprise compute resources only:
Create the role permissions JSON file.
{
"Name": "Docker Platform",
"IsCustom": true,
"Description": "Can install and run Docker platform.",
"Actions": [
"Microsoft.Authorization/*/read",
"Microsoft.Authorization/roleAssignments/write",
"Microsoft.Compute/availabilitySets/read",
"Microsoft.Compute/availabilitySets/write",
"Microsoft.Compute/disks/read",
"Microsoft.Compute/disks/write",
"Microsoft.Compute/virtualMachines/extensions/read",
"Microsoft.Compute/virtualMachines/extensions/write",
"Microsoft.Compute/virtualMachines/read",
"Microsoft.Compute/virtualMachines/write",
"Microsoft.Network/loadBalancers/read",
"Microsoft.Network/loadBalancers/write",
"Microsoft.Network/networkInterfaces/read",
"Microsoft.Network/networkInterfaces/write",
"Microsoft.Network/networkInterfaces/join/action",
"Microsoft.Network/publicIPAddresses/read",
"Microsoft.Network/virtualNetworks/read",
"Microsoft.Network/virtualNetworks/subnets/read",
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Resources/subscriptions/resourcegroups/read",
"Microsoft.Resources/subscriptions/resourcegroups/write",
"Microsoft.Security/advancedThreatProtectionSettings/read",
"Microsoft.Security/advancedThreatProtectionSettings/write",
"Microsoft.Storage/storageAccounts/read",
"Microsoft.Storage/storageAccounts/listKeys/action",
"Microsoft.Storage/storageAccounts/write"
],
"NotActions": [],
"AssignableScopes": [
"/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
]
}
Create the Docker Platform RBAC role.
az role definition create --role-definition platform-role.json
Network resources are services inside your cluster. These resources can include virtual networks, security groups, address pools, and gateways.
To create a custom role to deploy Docker Enterprise network resources only:
Create the role permissions JSON file.
{
"Name": "Docker Networking",
"IsCustom": true,
"Description": "Can install and manage Docker platform networking.",
"Actions": [
"Microsoft.Authorization/*/read",
"Microsoft.Network/loadBalancers/read",
"Microsoft.Network/loadBalancers/write",
"Microsoft.Network/loadBalancers/backendAddressPools/join/action",
"Microsoft.Network/networkInterfaces/read",
"Microsoft.Network/networkInterfaces/write",
"Microsoft.Network/networkInterfaces/join/action",
"Microsoft.Network/networkSecurityGroups/read",
"Microsoft.Network/networkSecurityGroups/write",
"Microsoft.Network/networkSecurityGroups/join/action",
"Microsoft.Network/networkSecurityGroups/securityRules/read",
"Microsoft.Network/networkSecurityGroups/securityRules/write",
"Microsoft.Network/publicIPAddresses/read",
"Microsoft.Network/publicIPAddresses/write",
"Microsoft.Network/publicIPAddresses/join/action",
"Microsoft.Network/virtualNetworks/read",
"Microsoft.Network/virtualNetworks/write",
"Microsoft.Network/virtualNetworks/subnets/read",
"Microsoft.Network/virtualNetworks/subnets/write",
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Resources/subscriptions/resourcegroups/read",
"Microsoft.Resources/subscriptions/resourcegroups/write"
],
"NotActions": [],
"AssignableScopes": [
"/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
]
}
Create the Docker Networking RBAC role.
az role definition create --role-definition networking-role.json
Before upgrading to a new version of UCP, check the release notes for this version. There you’ll find information about the new features, breaking changes, and other relevant information for upgrading to a particular version.
As part of the upgrade process, you’ll upgrade the Docker EE Engine installed on each node of the cluster to version 17.06.2-ee-8 or higher. You should plan for the upgrade to take place outside of business hours, to ensure there’s minimal impact to your users.
Also, don’t make changes to UCP configurations while you’re upgrading it. This can lead to misconfigurations that are difficult to troubleshoot.
Ensure that your cluster nodes meet the minimum requirements for memory and disk space. In particular, manager nodes must have at least 8GB of memory.
Ensure that your cluster nodes meet the minimum requirements for port openings. The ports in use are documented in the UCP system requirements.
Note
If you are upgrading a cluster to UCP 3.0.2 or higher on Microsoft Azure then please ensure all of the Azure prerequisites are met.
Before starting an upgrade, make sure that your cluster is healthy. If a problem occurs, this makes it easier to find and troubleshoot it.
Create a backup of your cluster. This allows you to recover if something goes wrong during the upgrade process.
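As a sketch, a backup can be created from a manager node with the docker/ucp backup command; the exact flags vary between UCP versions, so check the reference documentation for your release, and treat the passphrase and output file name below as placeholders:
docker container run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp:3.2.5 backup \
  --passphrase "<secret>" > /tmp/ucp-backup.tar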
Note
The backup archive is version-specific, so you can’t use it during the upgrade process. For example, if you create a backup archive for a UCP 2.2 cluster, you can’t use the archive file after you upgrade to UCP 3.0.
For each node that is part of your cluster, upgrade the Docker Engine installed on that node to Docker Engine version 19.03 or higher. Be sure to install the Docker Enterprise Edition.
Starting with the manager nodes, and then worker nodes:
Note
In your browser, navigate to Nodes in the UCP web interface, and check that the node is healthy and is part of the cluster.
You can upgrade UCP from the web or the command line interface.
When an upgrade is available for a UCP installation, a banner appears.
Clicking this message takes an admin user directly to the upgrade process. It can be found under the Upgrade tab of the Admin Settings section.
In the Available Versions dropdown, select the version you want to update to and click Upgrade UCP.
During the upgrade, the web interface will be unavailable, and you should wait until completion before continuing to interact with it. When the upgrade completes, you’ll see a notification that a newer version of the web interface is available and a browser refresh is required to see it.
To upgrade from the CLI, log into a UCP manager node using SSH, and run:
# Get the latest version of UCP
docker image pull docker/ucp:3.2.5
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 \
upgrade --interactive
This runs the upgrade command in interactive mode, which will prompt you for required configuration values.
Once the upgrade finishes, navigate to the UCP web interface and make sure that all the nodes managed by UCP are healthy.
Upgrading Universal Control Plane is the same, whether your hosts have access to the internet or not.
The only difference when installing on an offline host is that instead of pulling the UCP images from Docker Hub, you use a computer that’s connected to the internet to download a single package with all the images. Then you copy this package to the host where you upgrade UCP.
You can also use these links to get the UCP package from the command line:
$ wget <ucp-package-url> -O ucp.tar.gz
Now that you have the package in your local machine, you can transfer it to the machines where you want to upgrade UCP.
For each machine that you want to manage with UCP:
Copy the offline package to the machine.
$ scp ucp.tar.gz <user>@<host>
Use ssh to log in to the hosts where you transferred the package.
Load the UCP images.
Once the package is transferred to the hosts, you can use the
docker load
command, to load the Docker images from the tar
archive:
$ docker load -i ucp.tar.gz
Now that the offline hosts have all the images needed to upgrade UCP, you can upgrade UCP.
UCP is designed to scale as your applications grow in size and usage. You can add and remove nodes from the cluster to make it scale to your needs.
You can also uninstall UCP from your cluster. In this case, the UCP services are stopped and removed, but your Docker Engines will continue running in swarm mode. Your applications will continue running normally.
If you wish to remove a single node from the UCP cluster, remove that node from the cluster rather than uninstalling UCP.
After you uninstall UCP from the cluster, you’ll no longer be able to
enforce role-based access control (RBAC) to the cluster, or have a
centralized way to monitor and manage the cluster. After uninstalling
UCP from the cluster, you will no longer be able to join new nodes using
docker swarm join
, unless you reinstall UCP.
To uninstall UCP, log in to a manager node using ssh, and run the following command:
docker container run --rm -it \
-v /var/run/docker.sock:/var/run/docker.sock \
--name ucp \
docker/ucp:3.2.5 uninstall-ucp --interactive
This runs the uninstall command in interactive mode, so that you are prompted for any necessary configuration values.
The UCP configuration is kept in case you want to reinstall UCP with the
same configuration. If you want to also delete the configuration, run
the uninstall command with the --purge-config
option.
Refer to the reference documentation to learn the options available.
Once the uninstall command finishes, UCP is completely removed from all the nodes in the cluster. You don’t need to run the command again from other nodes.
After uninstalling UCP, the nodes in your cluster will still be in swarm
mode, but you can’t join new nodes until you reinstall UCP, because
swarm mode relies on UCP to provide the CA certificates that allow nodes
in the cluster to identify one another. Also, since swarm mode is no
longer controlling its own certificates, if the certificates expire
after you uninstall UCP, the nodes in the swarm won’t be able to
communicate at all. To fix this, either reinstall UCP before the
certificates expire or disable swarm mode by running
docker swarm leave --force
on every node.
When you install UCP, the Calico network plugin changes the host’s IP tables. When you uninstall UCP, the IP tables aren’t reverted to their previous state. After you uninstall UCP, restart the node to restore its IP tables.
With UCP, you can add labels to your nodes. Labels are metadata that describe the node, like its role (development, QA, production), its region (US, EU, APAC), or the kind of disk (HDD, SSD). Once you have labeled your nodes, you can add deployment constraints to your services, to ensure they are scheduled on a node with a specific label.
For example, you can apply labels based on their role in the development lifecycle, or the hardware resources they have.
Don’t create labels for authorization and permissions to resources. Instead, use resource sets, either UCP collections or Kubernetes namespaces, to organize access to your cluster.
In this example, we’ll apply a disk label with the value ssd to a node. Next, we’ll deploy a service with a deployment constraint to make sure the service is always scheduled to run on a node that has that label.
Log in with administrator credentials in the UCP web interface.
Select Nodes in the left-hand navigation menu.
In the nodes list, select the node to which you want to apply labels.
In the details pane, select the edit node icon in the upper-right corner to edit the node.
In the Edit Node page, scroll down to the Labels section.
Select Add Label.
Add a label with the key disk
and a value of ssd
.
Click Save then dismiss the Edit Node page.
In the node’s details pane, select Labels to view the labels that are applied to the node.
You can also do this from the CLI by running:
docker node update --label-add <key>=<value> <node-id>
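For the label used in this example, that would be:
docker node update --label-add disk=ssd <node-id>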
When deploying a service, you can specify constraints, so that the service gets scheduled only on a node that has a label that fulfills all of the constraints you specify.
In this example, when users deploy a service, they can add a constraint
for the service to be scheduled only on nodes that have SSD storage:
node.labels.disk == ssd
.
Navigate to the Stacks page.
Name the new stack “wordpress”.
Under Orchestrator Mode, select Swarm Services.
In the docker-compose.yml editor, paste the following stack file.
version: "3.1"
services:
db:
image: mysql:5.7
deploy:
placement:
constraints:
- node.labels.disk == ssd
restart_policy:
condition: on-failure
networks:
- wordpress-net
environment:
MYSQL_ROOT_PASSWORD: wordpress
MYSQL_DATABASE: wordpress
MYSQL_USER: wordpress
MYSQL_PASSWORD: wordpress
wordpress:
depends_on:
- db
image: wordpress:latest
deploy:
replicas: 1
placement:
constraints:
- node.labels.disk == ssd
restart_policy:
condition: on-failure
max_attempts: 3
networks:
- wordpress-net
ports:
- "8000:80"
environment:
WORDPRESS_DB_HOST: db:3306
WORDPRESS_DB_PASSWORD: wordpress
networks:
wordpress-net:
Click Create to deploy the stack, and when the stack deploys, click Done.
Navigate to the Nodes page, and click the node that has the
disk
label. In the details pane, click the Inspect Resource
drop-down menu and select Containers.
Dismiss the filter and navigate to the Nodes page.
Click a node that doesn’t have the disk
label. In the details pane,
click the Inspect Resource drop-down menu and select Containers.
There are no WordPress containers scheduled on the node. Dismiss the
filter.
You can declare the deployment constraints in your docker-compose.yml file or when you’re creating a stack. Also, you can apply them when you’re creating a service.
To check if a service has deployment constraints, navigate to the Services page and choose the service that you want to check. In the details pane, click Constraints to list the constraint labels.
To edit the constraints on the service, click Configure and select Details to open the Update Service page. Click Scheduling to view the constraints.
You can add or remove deployment constraints on this page.
UCP always runs with HTTPS enabled. When you connect to UCP, you need to make sure that the hostname that you use to connect is recognized by UCP’s certificates. If, for instance, you put UCP behind a load balancer that forwards its traffic to your UCP instance, your requests will be for the load balancer’s hostname or IP address, not UCP’s. UCP will reject these requests unless you include the load balancer’s address as a Subject Alternative Name (or SAN) in UCP’s certificates.
If you use your own TLS certificates, make sure that they have the correct SAN values.
If you want to use the self-signed certificate that UCP has out of the
box, you can set up the SANs when you install UCP with the --san
argument. You can also add them after installation.
In the UCP web UI, log in with administrator credentials and navigate to the Nodes page.
Click on a manager node, and in the details pane, click Configure and select Details.
In the SANs section, click Add SAN, and enter one or more SANs for the cluster.
Once you’re done, click Save.
You will have to do this on every existing manager node in the cluster, but once you have done so, the SANs are applied automatically to any new manager nodes that join the cluster.
You can also do this from the CLI by first running:
docker node inspect --format '{{ index .Spec.Labels "com.docker.ucp.SANs" }}' <node-id>
default-cs,127.0.0.1,172.17.0.1
This will get the current set of SANs for the given manager node. Append
your desired SAN to this list, for example
default-cs,127.0.0.1,172.17.0.1,example.com
, and then run:
docker node update --label-add com.docker.ucp.SANs=<SANs-list> <node-id>
<SANs-list>
is the list of SANs with your new SAN appended at the
end. As in the web UI, you must do this for every manager node.
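For example, using the list shown above with example.com appended:
docker node update --label-add com.docker.ucp.SANs=default-cs,127.0.0.1,172.17.0.1,example.com <node-id>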
Prometheus is an open-source systems monitoring and alerting toolkit. You can configure Docker as a Prometheus target. This topic shows you how to configure Docker, set up Prometheus to run as a Docker container, and monitor your Docker instance using Prometheus.
In UCP 3.0, Prometheus servers were standard containers. In UCP 3.1, Prometheus runs as a Kubernetes deployment. By default, this will be a DaemonSet that runs on every manager node. One benefit of this change is you can set the DaemonSet to not schedule on any nodes, which effectively disables Prometheus if you don’t use the UCP web interface.
The data is stored locally on disk for each Prometheus server, so data is not replicated on new managers or if you schedule Prometheus to run on a new node. Metrics are not kept longer than 24 hours.
Events, logs, and metrics are sources of data that provide observability of your cluster. Metrics monitors numerical data values that have a time-series component. There are several sources from which metrics can be derived, each providing different kinds of meaning for a business and its applications.
The Docker Enterprise platform provides a base set of metrics that gets you running and into production without having to rely on external or third-party tools. Docker strongly encourages the use of additional monitoring to provide more comprehensive visibility into your specific Docker environment, but recognizes the need for a basic set of metrics built into the product. The following are examples of these metrics:
These are high-level aggregate metrics that typically combine technical, financial, and organizational data to create metrics for business leaders of the IT infrastructure. Some examples of business metrics might be:
These are metrics in the domain of APM tools like AppDynamics or DynaTrace, and they provide information about the state or performance of the application itself.
Docker Enterprise 2.1 does not collect or expose application level metrics.
The following are metrics Docker Enterprise 2.1 collects, aggregates, and exposes:
These are metrics about the state of services running on the container platform. These types of metrics have very low cardinality, meaning the values are typically from a small fixed set of possibilities, commonly binary.
Web UI disk usage metrics, including free space, only reflect the Docker
managed portion of the filesystem: /var/lib/docker
. To monitor the
total space available on each filesystem of a UCP worker or manager, you
must deploy a third party monitoring solution to monitor the operating
system.
UCP deploys Prometheus by default on the manager nodes to provide a built-in metrics backend. For cluster sizes over 100 nodes or for use cases where scraping metrics from the Prometheus instances are needed, we recommend that you deploy Prometheus on dedicated worker nodes in the cluster.
To deploy Prometheus on worker nodes in a cluster:
Begin by sourcing an admin bundle.
Verify that ucp-metrics pods are running on all managers.
$ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide
NAME                READY     STATUS    RESTARTS   AGE       IP              NODE
ucp-metrics-hvkr7   3/3       Running   0          4h        192.168.80.66   3a724a-0
Add a Kubernetes node label to one or more workers. Here we add a label with key “ucp-metrics” and value “” to a node with name “3a724a-1”.
$ kubectl label node 3a724a-1 ucp-metrics=
node "test-3a724a-1" labeled
SELinux Prometheus Deployment for UCP 3.1.0, 3.1.1, and 3.1.2
If you are using SELinux, you must label your ucp-node-certs
directories properly on your worker nodes before you move the ucp-metrics
workload to them. To run ucp-metrics on a worker node, update the
ucp-node-certs
label by running sudo chcon -R
system_u:object_r:container_file_t:s0
/var/lib/docker/volumes/ucp-node-certs/_data
.
Patch the ucp-metrics DaemonSet’s nodeSelector using the same key and value used for the node label. This example shows the key “ucp-metrics” and the value “”.
$ kubectl -n kube-system patch daemonset ucp-metrics --type json -p '[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"ucp-metrics": ""}}]'
daemonset "ucp-metrics" patched
Observe that ucp-metrics pods are running only on the labeled workers.
$ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide
NAME                READY     STATUS        RESTARTS   AGE       IP              NODE
ucp-metrics-88lzx   3/3       Running       0          12s       192.168.83.1    3a724a-1
ucp-metrics-hvkr7   3/3       Terminating   0          4h        192.168.80.66   3a724a-0
To configure your external Prometheus server to scrape metrics from Prometheus in UCP:
Begin by sourcing an admin bundle.
Create a Kubernetes secret containing your bundle’s TLS material.
(cd $DOCKER_CERT_PATH && kubectl create secret generic prometheus --from-file=ca.pem --from-file=cert.pem --from-file=key.pem)
Create a Prometheus deployment and ClusterIP service using YAML as follows.
On AWS with the Kubernetes cloud provider configured, you can replace ClusterIP with LoadBalancer in the service YAML and then access
the service through the load balancer. If running Prometheus external
to UCP, change the following domain for the inventory container in
the Prometheus deployment from
ucp-controller.kube-system.svc.cluster.local
to an external
domain to access UCP from the Prometheus node.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
data:
  prometheus.yaml: |
    global:
      scrape_interval: 10s
    scrape_configs:
    - job_name: 'ucp'
      tls_config:
        ca_file: /bundle/ca.pem
        cert_file: /bundle/cert.pem
        key_file: /bundle/key.pem
        server_name: proxy.local
      scheme: https
      file_sd_configs:
      - files:
        - /inventory/inventory.json
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: inventory
        image: alpine
        command: ["sh", "-c"]
        args:
        - apk add --no-cache curl &&
          while :; do
            curl -Ss --cacert /bundle/ca.pem --cert /bundle/cert.pem --key /bundle/key.pem --output /inventory/inventory.json https://ucp-controller.kube-system.svc.cluster.local/metricsdiscovery;
            sleep 15;
          done
        volumeMounts:
        - name: bundle
          mountPath: /bundle
        - name: inventory
          mountPath: /inventory
      - name: prometheus
        image: prom/prometheus
        command: ["/bin/prometheus"]
        args:
        - --config.file=/config/prometheus.yaml
        - --storage.tsdb.path=/prometheus
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        volumeMounts:
        - name: bundle
          mountPath: /bundle
        - name: config
          mountPath: /config
        - name: inventory
          mountPath: /inventory
      volumes:
      - name: bundle
        secret:
          secretName: prometheus
      - name: config
        configMap:
          name: prometheus
      - name: inventory
        emptyDir:
          medium: Memory
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  ports:
  - port: 9090
    targetPort: 9090
  selector:
    app: prometheus
  sessionAffinity: ClientIP
EOF
Determine the service ClusterIP.
$ kubectl get service prometheus
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus ClusterIP 10.96.254.107 <none> 9090/TCP 1h
Forward port 9090 on the local host to the ClusterIP. The tunnel created does not need to be kept alive and is only intended to expose the Prometheus UI.
ssh -L 9090:10.96.254.107:9090 ANY_NODE
Visit http://127.0.0.1:9090
to explore the UCP metrics being
collected by Prometheus.
The following table lists the metrics that UCP exposes in Prometheus,
along with descriptions. Note that only the metrics labeled with
ucp_
are documented. Other metrics are exposed in Prometheus but are
not documented.
Name | Units | Description | Labels | Metric source |
---|---|---|---|---|
ucp_controller_services |
number of services | The total number of Swarm services. | Controller | |
ucp_engine_container_cpu_percent |
percentage | The percentage of CPU time this container is using. | container labels | Node |
ucp_engine_container_cpu_total_time_nanoseconds |
nanoseconds | Total CPU time used by this container in nanoseconds. | container labels | Node |
ucp_engine_container_health |
0.0 or 1.0 | Whether or not this container is healthy, according to its healthcheck. Note that if this value is 0, it just means that the container is not reporting healthy; it might not have a healthcheck defined at all, or its healthcheck might not have returned any results yet. | container labels | Node |
ucp_engine_container_memory_max_usage_bytes |
bytes | Maximum memory used by this container in bytes. | container labels | Node |
ucp_engine_container_memory_usage_bytes |
bytes | Current memory used by this container in bytes. | container labels | Node |
ucp_engine_container_memory_usage_percent |
percentage | Percentage of total node memory currently being used by this container. | container labels | Node |
ucp_engine_container_network_rx_bytes_total |
bytes | Number of bytes received by this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_network_rx_dropped_packets_total |
number of packets | Number of packets bound for this container on this network that were dropped in the last sample. | container networking labels | Node |
ucp_engine_container_network_rx_errors_total |
number of errors | Number of received network errors for this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_network_rx_packets_total |
number of packets | Number of received packets for this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_network_tx_bytes_total |
bytes | Number of bytes sent by this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_network_tx_dropped_packets_total |
number of packets | Number of packets sent from this container on this network that were dropped in the last sample. | container networking labels | Node |
ucp_engine_container_network_tx_errors_total |
number of errors | Number of sent network errors for this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_network_tx_packets_total |
number of packets | Number of sent packets for this container on this network in the last sample. | container networking labels | Node |
ucp_engine_container_unhealth |
0.0 or 1.0 | Whether or not this container is unhealthy, according to its healthcheck. Note that if this value is 0, it just means that the container is not reporting unhealthy; it might not have a healthcheck defined at all, or its healthcheck might not have returned any results yet. | container labels | Node |
ucp_engine_containers |
number of containers | Total number of containers on this node. | node labels | Node |
ucp_engine_cpu_total_time_nanoseconds |
nanoseconds | System CPU time used by this container in nanoseconds. | container labels | Node |
ucp_engine_disk_free_bytes |
bytes | Free disk space on the Docker root directory on this node in bytes. Note that this metric is not available for Windows nodes. | node labels | Node |
ucp_engine_disk_total_bytes |
bytes | Total disk space on the Docker root directory on this node in bytes. Note that this metric is not available for Windows nodes. | node labels | Node |
ucp_engine_images |
number of images | Total number of images on this node. | node labels | Node |
ucp_engine_memory_total_bytes |
bytes | Total amount of memory on this node in bytes. | node labels | Node |
ucp_engine_networks |
number of networks | Total number of networks on this node. | node labels | Node |
ucp_engine_node_health |
0.0 or 1.0 | Whether or not this node is healthy, as determined by UCP. | nodeName: node name, nodeAddr: node IP address | Controller |
ucp_engine_num_cpu_cores |
number of cores | Number of CPU cores on this node. | node labels | Node |
ucp_engine_pod_container_ready |
0.0 or 1.0 | Whether or not this container in a Kubernetes pod is ready, as determined by its readiness probe. | pod labels | Controller |
ucp_engine_pod_ready |
0.0 or 1.0 | Whether or not this Kubernetes pod is ready, as determined by its readiness probe. | pod labels | Controller |
ucp_engine_volumes |
number of volumes | Total number of volumes on this node. | node labels | Node |
Metrics exposed by UCP in Prometheus have standardized labels, depending on the resource that they are measuring. The following table lists some of the labels that are used, along with their values:
Label name | Value |
---|---|
collection |
The collection ID of the collection this container is in, if any. |
container |
The ID of this container. |
image |
The name of this container’s image. |
manager |
“true” if the container’s node is a UCP manager, “false” otherwise. |
name |
The name of the container. |
podName |
If this container is part of a Kubernetes pod, this is the pod’s name. |
podNamespace |
If this container is part of a Kubernetes pod, this is the pod’s namespace. |
podContainerName |
If this container is part of a Kubernetes pod, this is the container’s name in the pod spec. |
service |
If this container is part of a Swarm service, this is the service ID. |
stack |
If this container is part of a Docker compose stack, this is the name of the stack. |
The following metrics measure network activity for a given network attached to a given container. They have the same labels as Container labels, with one addition:
Label name | Value |
---|---|
network |
The ID of the network. |
Label name | Value |
---|---|
manager |
“true” if the node is a UCP manager, “false” otherwise. |
UCP exports metrics on every node and also exports additional metrics from every controller. The metrics that are exported from controllers are cluster-scoped, for example, the total number of Swarm services. Metrics that are exported from nodes are specific to those nodes, for example, the total memory on that node.
UCP 3.0 used its own role-based access control (RBAC) for Kubernetes clusters. New in UCP 3.1 is the ability to use native Kubernetes RBAC. The benefits of doing this are:
Kubernetes RBAC is turned on by default for Kubernetes clusters when customers upgrade to UCP 3.1.
Starting with UCP 3.1, Kubernetes and Swarm roles have separate views. You can view all the roles for a particular cluster under Access Control then Roles. Select Kubernetes or Swarm to view the specific roles for each.
You create Kubernetes roles either through the CLI using kubectl
or
through the UCP web interface.
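For example, with an admin client bundle loaded, you can create a namespaced role from the CLI. This is a minimal sketch; the pod-reader role name and the dev namespace are placeholders, not values from this guide:

kubectl create role pod-reader --verb=get --verb=list --verb=watch --resource=pods --namespace=dev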
To create a Kubernetes role in the UCP web interface:
From the UCP UI, select Access Control.
From the left navigation menu, select Roles.
Select the Kubernetes tab at the top of the window.
Select Create to create a Kubernetes role object.
Select a namespace from the Namespace drop-down list. Selecting a
specific namespace creates a role for use in that namespace, but
selecting all namespaces creates a ClusterRole
where you can
create rules for cluster-scoped Kubernetes resources as well as
namespaced resources.
Provide the YAML for the role, either by entering it in the Object YAML editor or select Click to upload a .yml file to choose and upload a .yml file instead.
When you have finished specifying the YAML, Select Create to complete role creation.
Kubernetes provides two types of role grants:

ClusterRoleBinding, which applies to all namespaces
RoleBinding, which applies to a specific namespace

To create a grant for a Kubernetes role in the UCP web interface:
From the UCP UI, select Access Control.
From the left navigation menu, select Grants.
Select the Kubernetes tab at the top of the window. All grants to Kubernetes roles can be viewed in the Kubernetes tab.
Select Create New Grant to start the Create Role Binding wizard and create a new grant for a given user, team or service.
Select the subject type. Your choices include users, teams, and service accounts.
To create a user role binding, select a username from the Users drop-down list then select Next.
Select a resource set for the subject. The default namespace is
automatically selected. To use a different namespace, select the
Select Namespace button next to the desired namespace. For
Cluster Role Binding
, slide the Apply Role Binding to all
namespaces selector to the right.
Select Next to continue.
Select the Cluster Role from the drop-down list. If you create a
ClusterRoleBinding
(by selecting Apply Role Binding to all
namespaces) then you may only select ClusterRoles. If you select a
specific namespace, you can choose any role from that namespace or
any ClusterRole.
Select Create to complete creating the grant.
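Alternatively, grants can be created from the CLI with kubectl. This is a minimal sketch; the pod-reader role, the user jane, and the dev namespace are placeholders:

kubectl create rolebinding jane-pod-reader --role=pod-reader --user=jane --namespace=dev
# Or, to bind a ClusterRole across all namespaces:
kubectl create clusterrolebinding jane-view --clusterrole=view --user=jane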
Audit logs are a chronological record of security-relevant activities by individual users, administrators or software components that have affected the system. They are focused on external user/agent actions and security rather than understanding state or events of the system itself.
Audit logs capture all HTTP actions (GET, PUT, POST, PATCH, DELETE) to all UCP API, Swarm API and Kubernetes API endpoints that are invoked (except for the ignored list) and sent to Docker Engine via stdout. Audit logging is a UCP component that integrates with Swarm, Kubernetes, and UCP APIs.
To give administrators more control over audit logging, three audit logging levels are provided: None (audit logging disabled), Metadata, and Request.
Note
Once UCP audit logging has been enabled, audit logs can be found
within the container logs of the ucp-controller
container on each
UCP manager node. Ensure that a logging driver is configured appropriately, with log rotation enabled, because audit logging can generate a large amount of data.
You can use audit logs to support use cases such as security auditing, compliance reporting, and historical troubleshooting.
UCP audit logging can be enabled via the UCP web user interface, the UCP API or via the UCP configuration file.
Download the UCP Client bundle from the command line.
Retrieve JSON for current audit log configuration.
export DOCKER_CERT_PATH=~/ucp-bundle-dir/
curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem --cacert ${DOCKER_CERT_PATH}/ca.pem -k -X GET https://ucp-domain/api/ucp/config/logging > auditlog.json
Open auditlog.json and set the auditLevel field to metadata or request.
{
"logLevel": "INFO",
"auditLevel": "metadata",
"supportDumpIncludeAuditLogs": false
}
Send the JSON request for the auditlog config with the same API path
but with the PUT
method.
curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem --cacert ${DOCKER_CERT_PATH}/ca.pem -k -H "Content-Type: application/json" -X PUT --data $(cat auditlog.json) https://ucp-domain/api/ucp/config/logging
Enabling UCP audit logging via the UCP configuration file can be done before or after a UCP installation.
The section of the UCP configuration file that controls UCP auditing logging is:
[audit_log_configuration]
level = "metadata"
support_dump_include_audit_logs = false
The supported values for level are "" (disabled), "metadata", and "request".
Note
The support_dump_include_audit_logs
flag specifies whether user
identification information from the ucp-controller container logs is
included in the support dump. To prevent this information from being
sent with the support dump, make sure that
support_dump_include_audit_logs
is set to false
. When
disabled, the support dump collection tool filters out any lines from
the ucp-controller
container logs that contain the substring
auditID
.
The audit logs are exposed today through the ucp-controller
logs.
You can access these logs locally through the Docker CLI or through an
external container logging solution, such as ELK.
To access audit logs using the Docker CLI:
Use docker logs to obtain audit logs. In the following example, the logs are tailed to show only the last entry.

$ docker logs ucp-controller --tail 1
{"audit":{"auditID":"f8ce4684-cb55-4c88-652c-d2ebd2e9365e","kind":"docker-swarm","level":"metadata","metadata":{"creationTimestamp":null},"requestReceivedTimestamp":"2019-01-30T17:21:45.316157Z","requestURI":"/metricsservice/query?query=(%20(sum%20by%20(instance)%20(ucp_engine_container_memory_usage_bytes%7Bmanager%3D%22true%22%7D))%20%2F%20(sum%20by%20(instance)%20(ucp_engine_memory_total_bytes%7Bmanager%3D%22true%22%7D))%20)%20*%20100\u0026time=2019-01-30T17%3A21%3A45.286Z","sourceIPs":["172.31.45.250:48516"],"stage":"RequestReceived","stageTimestamp":null,"timestamp":null,"user":{"extra":{"licenseKey":["FHy6u1SSg_U_Fbo24yYUmtbH-ixRlwrpEQpdO_ntmkoz"],"username":["admin"]},"uid":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a","username":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a"},"verb":"GET"},"level":"info","msg":"audit","time":"2019-01-30T17:21:45Z"}
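If the jq tool is installed on the manager node (an assumption; jq is not part of UCP), the JSON entries can be summarized from the CLI. This minimal sketch prints the HTTP verb and request URI of recent audit entries:

docker logs ucp-controller --tail 100 2>&1 | grep auditID | jq -r '.audit | "\(.verb) \(.requestURI)"'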
Here is a sample audit log for a Kubernetes cluster.
{"audit": {
"metadata": {...},
"level": "Metadata",
"timestamp": "2018-08-07T22:10:35Z",
"auditID": "7559d301-fa6b-4ad6-901c-b587fab75277",
"stage": "RequestReceived",
"requestURI": "/api/v1/namespaces/default/pods",
"verb": "list",
"user": {"username": "alice",...},
"sourceIPs": ["127.0.0.1"],
...,
"requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}
Here is a sample audit log for a Swarm cluster.
{"audit": {
"metadata": {...},
"level": "Metadata",
"timestamp": "2018-08-07T22:10:35Z",
"auditID": "7559d301-94e7-4ad6-901c-b587fab31512",
"stage": "RequestReceived",
"requestURI": "/v1.30/configs/create",
"verb": "post",
"user": {"username": "alice",...},
"sourceIPs": ["127.0.0.1"],
...,
"requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}
The following API endpoints are ignored since they are not considered security events and may create a large amount of log entries.
Information for the following API endpoints is redacted from the audit logs for security purposes:
/secrets/create (POST)
/secrets/{id}/update (POST)
/swarm/join (POST)
/swarm/update (POST)
/auth/login (POST)
(POST)SAML is commonly supported by enterprise authentication systems. SAML-based single sign-on (SSO) gives you access to UCP through a SAML 2.0-compliant identity provider.
The identity providers UCP supports are Okta and ADFS.
There are values your identity provider needs for successful integration with UCP, as follows. These values can vary between identity providers. Consult your identity provider documentation for instructions on providing these values as part of their integration process.
Okta integration requires these values:
The UCP URL qualified with /enzi/v0/saml/acs (the Assertion Consumer Service URL). For example, https://111.111.111.111/enzi/v0/saml/acs.
The UCP URL qualified with /enzi/v0/saml/metadata (the service provider metadata URL). For example, https://111.111.111.111/enzi/v0/saml/metadata.
An application username mapping. For example, ${f:substringBefore(user.email, "@")} specifies the username portion of the email address.
An attribute statement with Name: fullname, Value: user.displayName.
Group attribute statements with Name: member-of, Filter: (user defined) for associating group membership (the group name is returned with the assertion), and Name: is-admin, Filter: (user defined) for identifying if the user is an admin.

ADFS integration requires the following steps:

Configure UCP as a relying party using the service provider metadata URL, which is the UCP URL qualified with /enzi/v0/saml/metadata. For example, https://111.111.111.111/enzi/v0/saml/metadata.
Add a claim rule that issues the fullname attribute, such as:

c:[Type == "http://schemas.xmlsoap.org/claims/CommonName"] => issue(Type = "fullname", Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer, Value = c.Value, ValueType = c.ValueType);
To enable SAML authentication:
Go to the UCP web interface.
Navigate to the Admin Settings.
Select Authentication & Authorization.
In the SAML Enabled section, select Yes to display the required settings. The settings are grouped by those needed by the identity provider server and by those needed by UCP as a SAML service provider.
In IdP Metadata URL enter the URL for the identity provider’s metadata.
If the metadata URL is publicly certified, you can leave Skip TLS Verification unchecked and Root Certificates Bundle blank, which is the default. Skipping TLS verification is not recommended in production environments. If the metadata URL cannot be certified by the default certificate authority store, you must provide the certificates from the identity provider in the Root Certificates Bundle field.
In UCP Host enter the URL that includes the IP address or domain of your UCP installation. The port number is optional. The current IP address or domain appears by default.
To customize the text of the sign-in button, enter your button text in the Customize Sign In Button Text field. The default text is ‘Sign in with SAML’.
The Service Provider Metadata URL and Assertion Consumer Service (ACS) URL appear in shaded boxes. Select the copy icon at the right side of each box to copy that URL to the clipboard for pasting in the identity provider workflow.
Select Save to complete the integration.
You can download a client bundle to access UCP. A client bundle is a group of certificates downloadable directly from UCP web interface that enables command line as well as API access to UCP. It lets you authorize a remote Docker engine to access specific user accounts managed in Docker Enterprise, absorbing all associated RBAC controls in the process. You can now execute docker swarm commands from your remote machine that take effect on the remote cluster. You can download the client bundle in the Admin Settings under My Profile.
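For reference, loading a downloaded client bundle from a shell typically looks like the following; the archive name is a placeholder:

unzip ucp-bundle-admin.zip -d ucp-bundle-admin
cd ucp-bundle-admin
eval "$(<env.sh)"
docker node ls   # subsequent commands now run against the UCP cluster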
Warning
Users who have been previously authorized using a client bundle will continue to be able to access UCP regardless of the newly configured SAML access controls. To ensure that access from the client bundle stays in sync with the identity provider, review and revoke previously issued client bundles where needed. Otherwise, a previously authorized user could get access to UCP through their existing client bundle.
Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between parties. The SAML integration process is described below.
Service Provider metadata is available at
https://<SP Host>/enzi/v0/saml/metadata
after SAML is enabled. The
metadata link is also labeled as entityID
.
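For example, once SAML is enabled you can retrieve the metadata with curl; <SP Host> is a placeholder, and --insecure is only needed while UCP presents certificates your client does not trust:

curl --insecure https://<SP Host>/enzi/v0/saml/metadata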
Note
Only POST
binding is supported for the ‘Assertion Consumer
Service’, which is located at https://<SP Host>/enzi/v0/saml/acs
.
After UCP sends an AuthnRequest
to the IdP, the following
Assertion
is expected:
Subject
includes a NameID
that is identified as the username
for UCP. In AuthnRequest
, NameIDFormat
is set to
urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified
. This
allows maximum compatibility for various Identity Providers.
<saml2:Subject>
<saml2:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified">mobywhale</saml2:NameID>
<saml2:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer">
<saml2:SubjectConfirmationData NotOnOrAfter="2018-09-10T20:04:48.001Z" Recipient="https://18.237.224.122/enzi/v0/saml/acs"/>
</saml2:SubjectConfirmation>
</saml2:Subject>
An optional Attribute
named fullname
is mapped to the Full
Name field in the UCP account.
Note
UCP uses the value of the first occurrence of an Attribute
with
Name="fullname"
as the Full Name.
<saml2:Attribute Name="fullname" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified">
<saml2:AttributeValue
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:string">user.displayName
</saml2:AttributeValue>
</saml2:Attribute>
An optional Attribute
named member-of
is linked to the UCP
team. The values are set in the UCP interface.
Note
UCP uses all AttributeStatements
and Attributes
in the
Assertion
with Name="member-of"
.
<saml2:Attribute Name="member-of" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified">
<saml2:AttributeValue
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:string">groupName
</saml2:AttributeValue>
</saml2:Attribute>
An optional Attribute
with the name is-admin
is used to
identify if the user is an administrator.
Note
When there is an Attribute
with the name is-admin
, the user
is an administrator. The content in the AttributeValue
is
ignored.
<saml2:Attribute Name="is-admin" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified">
<saml2:AttributeValue
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:string">value_does_not_matter
</saml2:AttributeValue>
</saml2:Attribute>
When two or more group names are expected to be returned with the Assertion,
use the regex
filter. For example, use the value apple|orange
to
return groups apple
and orange
.
Enter the Identity Provider’s metadata URL to obtain its metadata. To access the URL, you may need to provide the CA certificate that can verify the remote server.
Use the ‘edit’ or ‘create’ team dialog to associate SAML group assertion with the UCP team to synchronize user team membership when the user logs in.
To use Helm and Tiller with UCP, you must modify the kube-system
default
service account to define the necessary roles. Enter the following kubectl
commands in this order:
kubectl create rolebinding default-view --clusterrole=view --serviceaccount=kube-system:default --namespace=kube-system
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
For information on the use of Helm, refer to the official Helm user documentation.
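With those bindings in place, a Helm 2 client can install Tiller into the cluster. This is a minimal sketch, assuming the helm CLI (version 2.x) is installed and a UCP client bundle is loaded:

helm init
helm version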
UCP integrates with LDAP directory services, so that you can manage users and groups from your organization's directory, and that information is automatically propagated to UCP and DTR.
If you enable LDAP, UCP uses a remote directory server to create users automatically, and all logins are forwarded to the directory server.
When you switch from built-in authentication to LDAP authentication, all manually created users whose usernames don’t match any LDAP search results are still available.
When you enable LDAP authentication, you can choose whether UCP creates user accounts only when users log in for the first time. Select the Just-In-Time User Provisioning option to ensure that the only LDAP accounts that exist in UCP are those that have had a user log in to UCP.
You control how UCP integrates with LDAP by creating searches for users.
You can specify multiple search configurations, and you can specify
multiple LDAP servers to integrate with. Searches start with the
Base DN
, which is the distinguished name of the node in the LDAP
directory tree where the search starts looking for users.
Access LDAP settings by navigating to the Authentication & Authorization page in the UCP web interface. There are two sections for controlling LDAP searches and servers.
The user search configurations specify the Base DN, scope, filter, the username attribute, and the full name attribute. These searches are stored in a list, and the ordering may be important, depending on your search configuration.

Here's what happens when UCP synchronizes with LDAP:
UCP selects the LDAP domain server to use by considering the Base DN from the user search config and selecting the domain server that has the longest domain suffix match.
If no domain server has a domain suffix that matches the Base DN from the search config, UCP uses the default domain server.

The domain server to use is determined by the Base DN
in each search
config. UCP doesn’t perform search requests against each of the domain
servers, only the one which has the longest matching domain suffix, or
the default if there’s no match.
Here’s an example. Let’s say we have three LDAP domain servers:
Domain | Server URL |
---|---|
default | ldaps://ldap.example.com |
dc=subsidiary1,dc=com |
ldaps://ldap.subsidiary1.com |
dc=subsidiary2,dc=subsidiary1,dc=com |
ldaps://ldap.subsidiary2.com |
Here are three user search configs with the following Base DNs
:
baseDN=ou=people,dc=subsidiary1,dc=com
For this search config, dc=subsidiary1,dc=com
is the only server
with a domain which is a suffix, so UCP uses the server
ldaps://ldap.subsidiary1.com
for the search request.
baseDN=ou=product,dc=subsidiary2,dc=subsidiary1,dc=com
For this search config, two of the domain servers have a domain which
is a suffix of this base DN, but
dc=subsidiary2,dc=subsidiary1,dc=com
is the longer of the two, so
UCP uses the server ldaps://ldap.subsidiary2.com
for the search
request.
baseDN=ou=eng,dc=example,dc=com
For this search config, there is no server with a domain specified
which is a suffix of this base DN, so UCP uses the default server,
ldaps://ldap.example.com
, for the search request.
If there are username
collisions for the search results between
domains, UCP uses only the first search result, so the ordering of the
user search configs may be important. For example, if both the first and
third user search configs result in a record with the username
jane.doe
, the first has higher precedence and the later one is ignored.
For this reason, it’s important to choose a username
attribute
that’s unique for your users across all domains.
Because names may collide, it’s a good idea to use something unique to
the subsidiary, like the email address for each person. Users can log in
with the email address, for example, jane.doe@subsidiary1.com
.
To configure UCP to create and authenticate users by using an LDAP directory, go to the UCP web interface, navigate to the Admin Settings page, and click Authentication & Authorization to select the method used to create and authenticate users.
In the LDAP Enabled section, click Yes. Now configure your LDAP directory integration.
Use this setting to change the default permissions of new users.
Click the drop-down menu to select the permission level that UCP assigns
by default to the private collections of new users. For example, if you
change the value to View Only
, all users who log in for the first
time after the setting is changed have View Only
access to their
private collections, but permissions remain unchanged for all existing
users.
Click Yes to enable integrating UCP users and teams with LDAP servers.
Field | Description |
---|---|
LDAP server URL | The URL where the LDAP server can be reached. |
Reader DN | The distinguished name of the LDAP account used for searching entries in the LDAP server. As a best practice, this should be an LDAP read-only user. |
Reader password | The password of the account used for searching entries in the LDAP server. |
Use Start TLS | Whether to authenticate/encrypt the connection after connecting to the
LDAP server over TCP. If you set the LDAP Server URL field with
ldaps:// , this field is ignored. |
Skip TLS verification | Whether to verify the LDAP server certificate when using TLS. The connection is still encrypted but vulnerable to man-in-the-middle attacks. |
No simple pagination | If your LDAP server doesn’t support pagination. |
Just-In-Time User Provisioning | Whether to create user accounts only when users log in for the first
time. The default value of true is recommended. If you upgraded from
UCP 2.0.x, the default is false . |
Note
LDAP connections using certificates created with TLS v1.2 do not currently advertise support for sha512WithRSAEncryption in the TLS handshake which leads to issues establishing connections with some clients. Support for advertising sha512WithRSAEncryption will be added in UCP 3.1.0.
Click Confirm to add your LDAP domain.
To integrate with more LDAP servers, click Add LDAP Domain.
Field | Description |
---|---|
Base DN | The distinguished name of the node in the directory tree where the search should start looking for users. |
Username attribute | The LDAP attribute to use as username on UCP. Only user entries with a
valid username will be created. A valid username is no longer than 100
characters and does not contain any unprintable characters, whitespace
characters, or any of the following characters: / \ [ ]
: ; | = , + * ? < >. |
Full name attribute | The LDAP attribute to use as the user’s full name for display purposes. If left empty, UCP will not create new users with a full name value. |
Filter | The LDAP search filter used to find users. If you leave this field empty, all directory entries in the search scope with valid username attributes are created as users. |
Search subtree instead of just one level | Whether to perform the LDAP search on a single level of the LDAP tree, or search through the full LDAP tree starting at the Base DN. |
Match Group Members | Whether to further filter users by selecting those who are also members
of a specific group on the directory server. This feature is helpful if
the LDAP server does not support memberOf search filters. |
Iterate through group members | If Select Group Members is selected, this option searches for users
by first iterating over the target group’s membership, making a separate
LDAP query for each member, as opposed to first querying for all users
which match the above search query and intersecting those with the set
of group members. This option can be more efficient in situations where
the number of members of the target group is significantly smaller than
the number of users which would match the above search filter, or if
your directory server does not support simple pagination of search
results. |
Group DN | If Select Group Members is selected, this specifies the
distinguished name of the group from which to select users. |
Group Member Attribute | If Select Group Members is selected, the value of this group
attribute corresponds to the distinguished names of the members of the
group. |
To configure more user search queries, click Add LDAP User Search Configuration again. This is useful in cases where users may be found in multiple distinct subtrees of your organization’s directory. Any user entry which matches at least one of the search configurations will be synced as a user.
Field | Description |
---|---|
Username | An LDAP username for testing authentication to this application. This value corresponds with the Username Attribute specified in the LDAP user search configurations section. |
Password | The user’s password used to authenticate (BIND) to the directory server. |
Before you save the configuration changes, you should test that the integration is correctly configured. You can do this by providing the credentials of an LDAP user, and clicking the Test button.
Field | Description |
---|---|
Sync interval | The interval, in hours, to synchronize users between UCP and the LDAP server. When the synchronization job runs, new users found in the LDAP server are created in UCP with the default permission level. UCP users that don’t exist in the LDAP server become inactive. |
Enable sync of admin users | This option specifies that system admins should be synced directly with members of a group in your organization’s LDAP directory. The admins will be synced to match the membership of the group. The configured recovery admin user will also remain a system admin. |
Once you’ve configured the LDAP integration, UCP synchronizes users based on the interval you’ve defined starting at the top of the hour. When the synchronization runs, UCP stores logs that can help you troubleshoot when something goes wrong.
You can also manually synchronize users by clicking Sync Now.
When a user is removed from LDAP, the effect on the user’s UCP account depends on the Just-In-Time User Provisioning setting:
false
: Users deleted from
LDAP become inactive in UCP after the next LDAP synchronization runs.true
: Users deleted from
LDAP can’t authenticate, but their UCP accounts remain active. This
means that they can use their client bundles to run commands. To
prevent this, deactivate their UCP user accounts.UCP saves a minimum amount of user data required to operate. This includes the value of the username and full name attributes that you have specified in the configuration as well as the distinguished name of each synced user. UCP does not store any additional data from the directory server.
UCP enables syncing teams with a search query or group in your organization’s LDAP directory.
As of UCP 3.1.5, LDAP-specific GET and PUT API endpoints have been added to the Config resource. Note that swarm mode has to be enabled before you can use the following endpoints:

GET /api/ucp/config/auth/ldap - Returns information on your current system LDAP configuration.
PUT /api/ucp/config/auth/ldap - Lets you update your LDAP configuration.
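For example, the current LDAP configuration can be retrieved with curl, reusing the bearer-token pattern shown elsewhere in this guide; UCP_HOST and AUTHTOKEN are placeholders:

curl --insecure -X GET "https://UCP_HOST/api/ucp/config/auth/ldap" -H "Authorization: Bearer $AUTHTOKEN"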
You can configure UCP to allow users to deploy and run services only in worker nodes. This ensures all cluster management functionality stays performant, and makes the cluster more secure.
Important
In the event that a user deploys a malicious service capable of affecting the node on which it is running, that service will not be able to affect any other nodes in the cluster or have any impact on cluster management functionality.
To restrict users from deploying to manager nodes, log in with administrator credentials to the UCP web interface, navigate to the Admin Settings page, and choose Scheduler.
You can then choose if user services should be allowed to run on manager nodes or not.
Note
Creating a grant with the Scheduler
role against the /
collection
takes precedence over any other grants with Node Schedule
on
subcollections.
By default, UCP clusters take advantage of taints and tolerations to prevent a user's workload from being deployed onto UCP manager or DTR nodes.
You can view this taint by running:
$ kubectl get nodes <ucpmanager> -o json | jq -r '.spec.taints | .[]'
{
"effect": "NoSchedule",
"key": "com.docker.ucp.manager"
}
Note
Workloads deployed by an Administrator in the kube-system
namespace do not follow these scheduling constraints. If an
Administrator deploys a workload in the kube-system
namespace, a
toleration is applied to bypass this taint, and the workload is
scheduled on all node types.
To allow administrators to deploy workloads across all node types, an administrator can tick the "Allow administrators to deploy containers on UCP managers or nodes running DTR" box in the UCP web interface.
For all new workloads deployed by Administrators after this box has been ticked, UCP will apply a toleration to your workloads to allow the pods to be scheduled on all node types.
For existing workloads, the Administrator will need to edit the Pod
specification, through kubectl edit <object> <workload>
or the UCP
web interface and add the following toleration:
tolerations:
- key: "com.docker.ucp.manager"
operator: "Exists"
You can check that the toleration has been applied successfully by running:
$ kubectl get <object> <workload> -o json | jq -r '.spec.template.spec.tolerations | .[]'
{
"key": "com.docker.ucp.manager",
"operator": "Exists"
}
To allow Kubernetes users and service accounts to deploy workloads across all node types in your cluster, an administrator will need to tick "Allow all authenticated users, including service accounts, to schedule on all nodes, including UCP managers and DTR nodes." in the UCP web interface.
For all new workloads deployed by Kubernetes Users after this box has been ticked, UCP will apply a toleration to your workloads to allow the pods to be scheduled on all node types. For existing workloads, the User would need to edit Pod Specification as detailed above in the “Allow Administrators to Schedule on Manager / DTR Nodes” section.
There is a NoSchedule taint on UCP managers and DTR nodes, and if you have scheduling on managers/workers disabled in the UCP scheduling options, a toleration for that taint will not get applied to the deployments, so they should not schedule on those nodes, unless the Kubernetes workload is deployed in the kube-system namespace.
With UCP you can require that applications use only Docker images signed by UCP users you trust. Each time a user attempts to deploy an application to the cluster, UCP checks whether the application is using a trusted Docker image and halts the deployment if it is not.
By signing and verifying the Docker images, you ensure that the images being used in your cluster are the ones you trust and haven’t been altered either in the image registry or on their way from the image registry to your UCP cluster.
To configure UCP to only allow running services that use Docker trusted images:
Access the UCP UI and browse to the Admin Settings page.
In the left navigation pane, click Docker Content Trust.
Select the Run only signed images option.
With this setting, UCP allows deploying any image as long as the image has been signed. It doesn’t matter who signed the image.
To enforce that the image needs to be signed by specific teams, click Add Team and select those teams from the list.
If you specify multiple teams, the image needs to be signed by a member of each team, or someone that is a member of all those teams.
Click Save.
At this point, UCP starts enforcing the policy. Existing services will continue running and can be restarted if needed, however UCP only allows the deployment of new services that use a trusted image.
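For reference, images are typically signed by enabling Docker Content Trust in the client before pushing; a minimal sketch, with <dtr-url> as a placeholder:

export DOCKER_CONTENT_TRUST=1
docker push <dtr-url>/admin/wordpress:latest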
UCP enables setting properties of user sessions, like session timeout and number of concurrent sessions.
To configure UCP login sessions, go to the UCP web interface, navigate to the Admin Settings page and click Authentication & Authorization.
Field | Description |
---|---|
Lifetime Minutes | The initial lifetime of a login session, starting from the time UCP generates the session. When this time expires, UCP invalidates the active session. To establish a new session, the user must authenticate again. The default is 60 minutes with a minimum of 10 minutes. |
Renewal Threshold Minutes | The time by which UCP extends an active session before session expiration. UCP extends the session by the number of minutes specified in Lifetime Minutes. The threshold value can't be greater than Lifetime Minutes. The default extension is 20 minutes. To specify that no sessions are extended, set the threshold value to zero. This may cause users to be logged out unexpectedly while using the UCP web interface. The maximum threshold is 5 minutes less than Lifetime Minutes. |
Per User Limit | The maximum number of simultaneous logins for a user. If creating a new session exceeds this limit, UCP deletes the least recently used session. Every time you use a session token, the server marks it with the current time (lastUsed metadata). When you create a new session that would put you over the per user limit, the session with the oldest lastUsed time is deleted. This is not necessarily the oldest session. To disable this limit, set the value to zero. The default limit is 10 sessions. |
There are two ways to configure UCP: through the web interface, or through a configuration file.
You can customize the UCP installation by creating a configuration file at the time of installation. During the installation, UCP detects and starts using the configuration specified in this file.
You can use the configuration file in different ways to set up your UCP cluster.
For example, you can run the example-config command, edit the example configuration file, and set the configuration at install time or import it after installation.

Specify your configuration settings in a TOML file.
Use the config-toml
API to export the current settings and write
them to a file. Within the directory of a UCP admin user’s client
certificate bundle, the following command exports the current configuration for
the UCP hostname UCP_HOST
to a file named ucp-config.toml
:
AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://UCP_HOST/auth/login | jq --raw-output .auth_token)
curl -X GET "https://UCP_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > ucp-config.toml
To import a modified configuration, upload the file to the same endpoint with the PUT method:

curl -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file 'path/to/ucp-config.toml' https://UCP_HOST/api/ucp/config-toml
You can configure UCP to import an existing configuration file at install time. To do this using the Configs feature of Docker Swarm, follow these steps.
Create a Docker Swarm Config object named com.docker.ucp.config with the TOML value of your UCP configuration file contents.
When installing UCP on that cluster, specify the --existing-config flag to have the installer use that object for its initial configuration.
After installation, remove the com.docker.ucp.config object.
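A minimal sketch of the first step, assuming the configuration has been saved locally as ucp-config.toml:

docker config create com.docker.ucp.config ucp-config.toml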
You can see an example TOML config file that shows how to configure UCP
settings. From the command line, run UCP with the example-config
option:
docker container run --rm docker/ucp:3.2.5 example-config
Parameter | Required | Description |
---|---|---|
backend |
no | The name of the authorization backend to use, either managed or
ldap . The default is managed . |
default_new_user_role |
no | The role that new users get for their private resource sets. Values are
admin , viewonly , scheduler , restrictedcontrol , or
fullcontrol . The default is restrictedcontrol. |
Parameter | Required | Description |
---|---|---|
lifetime_minutes |
no | The initial session lifetime, in minutes. The default is 60 minutes. |
renewal_threshold_minutes |
no | The length of time, in minutes, before the expiration of a session where, if used, a session will be extended by the current configured lifetime from then. A zero value disables session extension. The default is 20 minutes. |
per_user_limit |
no | The maximum number of sessions that a user can have active simultaneously. If creating a new session would put a user over this limit, the least recently used session will be deleted. A value of zero disables limiting the number of sessions that users may have. The default is 10. |
store_token_per_session |
no | If set, the user token is stored in sessionStorage instead of
localStorage . Note that this option will log the user out and
require them to log back in since they are actively changing how their
authentication is stored. |
An array of tables that specifies the DTR instances that the current UCP instance manages.
Parameter | Required | Description |
---|---|---|
host_address |
yes | The address for connecting to the DTR instance tied to this UCP cluster. |
service_id |
yes | The DTR instance’s OpenID Connect Client ID, as registered with the Docker authentication provider. |
ca_bundle |
no | If you’re using a custom certificate authority (CA), ca_bundle
specifies the root CA bundle for the DTR instance. The value is a string
with the contents of a ca.pem file. |
Configures audit logging options for UCP components.
Parameter | Required | Description |
---|---|---|
level |
no | Specify the audit logging level. Leave empty for disabling audit logs
(default). Other legal values are metadata and request . |
support_dump_include_audit_logs |
no | When set to true, support dumps will include audit logs in the logs of
the ucp-controller container of each manager node. The default is
false . |
Specifies scheduling options and the default orchestrator for new nodes.
Note
If you run the kubectl
command, such as
kubectl describe nodes
, to view scheduling rules on Kubernetes
nodes, it does not reflect what is configured in UCP Admin settings.
UCP uses taints to control container scheduling on nodes and is
unrelated to kubectl’s Unschedulable
boolean flag.
Parameter | Required | Description |
---|---|---|
enable_admin_ucp_scheduling |
no | Set to true to allow admins to schedule containers on manager nodes.
The default is false . |
default_node_orchestrator |
no | Sets the type of orchestrator to use for new nodes that are joined to
the cluster. Can be swarm or kubernetes . The default is swarm . |
Specifies the analytics data that UCP collects.
Parameter | Required | Description |
---|---|---|
disable_usageinfo |
no | Set to true to disable analytics of usage information. The default
is false . |
disable_tracking |
no | Set to true to disable analytics of API call information. The
default is false . |
anonymize_tracking |
no | Anonymize analytic data. Set to true to hide your license ID. The
default is false . |
cluster_label |
no | Set a label to be included with analytics. |
Specifies whether DTR images require signing.
Parameter | Required | Description |
---|---|---|
require_content_trust |
no | Set to true to require images be signed by content trust. The
default is false . |
require_signature_from |
no | A string array that specifies users or teams which must sign images. |
Configures the logging options for UCP components.
Parameter | Required | Description |
---|---|---|
protocol |
no | The protocol to use for remote logging. Values are tcp and udp .
The default is tcp . |
host |
no | Specifies a remote syslog server to send UCP controller logs to. If
omitted, controller logs are sent through the default docker daemon
logging driver from the ucp-controller container. |
level |
no | The logging level for UCP components. Values are syslog priority levels:
debug , info , notice , warning , err , crit ,
alert , and emerg . |
Specifies whether your UCP license is automatically renewed.
Parameter | Required | Description |
---|---|---|
auto_refresh |
no | Set to true to enable attempted automatic license renewal when the
license nears expiration. If disabled, you must manually upload renewed
license after expiration. The default is true . |
Included when you need to set custom API headers. You can repeat this
section multiple times to specify multiple separate headers. If you
include custom headers, you must specify both name
and value
.
[[custom_api_server_headers]]
Item | Description |
---|---|
name | Set to specify the name of the custom header with name =
“X-Custom-Header-Name”. |
value | Set to specify the value of the custom header with value = “Custom
Header Value”. |
A map describing default values to set on Swarm services at creation time if those fields are not explicitly set in the service spec.
[user_workload_defaults]
[user_workload_defaults.swarm_defaults]
Parameter | Required | Description |
---|---|---|
[tasktemplate.restartpolicy.delay] |
no | Delay between restart attempts (ns|us|ms|s|m|h). The default is value =
"5s" . |
[tasktemplate.restartpolicy.maxattempts] |
no | Maximum number of restarts before giving up. The default is value =
"3" . |
Configures the cluster that the current UCP instance manages.
The dns
, dns_opt
, and dns_search
settings configure the DNS
settings for UCP components. Assigning these values overrides the
settings in a container’s /etc/resolv.conf
file.
Parameter | Required | Description |
---|---|---|
controller_port |
yes | Configures the port that the ucp-controller listens to. The default
is 443 . |
kube_apiserver_port |
yes | Configures the port the Kubernetes API server listens to. |
swarm_port |
yes | Configures the port that the ucp-swarm-manager listens to. The
default is 2376 . |
swarm_strategy |
no | Configures placement strategy for container scheduling. This doesn’t
affect swarm-mode services. Values are spread , binpack , and
random . |
dns |
yes | Array of IP addresses to add as nameservers. |
dns_opt |
yes | Array of options used by DNS resolvers. |
dns_search |
yes | Array of domain names to search when a bare unqualified hostname is used inside of a container. |
profiling_enabled |
no | Set to true to enable specialized debugging endpoints for profiling
UCP performance. The default is false . |
kv_timeout |
no | Sets the key-value store timeout setting, in milliseconds. The default
is 5000 . |
kv_snapshot_count |
no | Sets the key-value store snapshot count setting. The default is
20000 . |
external_service_lb |
no | Specifies an optional external load balancer for default links to services with exposed ports in the web interface. |
cni_installer_url |
no | Specifies the URL of a Kubernetes YAML file to be used for installing a CNI plugin. Applies only during initial installation. If empty, the default CNI plugin is used. |
metrics_retention_time |
no | Adjusts the metrics retention time. |
metrics_scrape_interval |
no | Sets the interval for how frequently managers gather metrics from nodes in the cluster. |
metrics_disk_usage_interval |
no | Sets the interval for how frequently storage metrics are gathered. This operation can be expensive when large volumes are present. |
rethinkdb_cache_size |
no | Sets the size of the cache used by UCP’s RethinkDB servers. The default is 1GB, but leaving this field empty or specifying auto instructs RethinkDB to determine a cache size automatically. |
exclude_server_identity_headers |
no | Set to true to disable the X-Server-Ip and X-Server-Name
headers. |
cloud_provider |
no | Set the cloud provider for the kubernetes cluster. |
pod_cidr |
yes | Sets the subnet pool from which the IP for the Pod should be allocated from the CNI ipam plugin. Default is 192.168.0.0/16 . |
calico_mtu |
no | Set the MTU (maximum transmission unit) size for the Calico plugin. |
ipip_mtu |
no | Set the IPIP MTU size for the calico IPIP tunnel interface. |
azure_ip_count |
yes | Set the IP count for azure allocator to allocate IPs per Azure virtual machine. |
service_cluster_ip_range |
yes | Sets the subnet pool from which the IP for Services should be allocated. Default is 10.96.0.0/16 . |
nodeport_range |
yes | Sets the port range in which Kubernetes services of type NodePort can be exposed. Default is 32768-35535 . |
custom_kube_api_server_flags |
no | Set the configuration options for the Kubernetes API server. (dev) |
custom_kube_controller_manager_flags |
no | Set the configuration options for the Kubernetes controller manager. (dev) |
custom_kubelet_flags |
no | Set the configuration options for Kubelets. (dev) |
custom_kube_scheduler_flags |
no | Set the configuration options for the Kubernetes scheduler. (dev) |
local_volume_collection_mapping |
no | Store data about collections for volumes in UCP’s local KV store instead of on the volume labels. This is used for enforcing access control on volumes. |
manager_kube_reserved_resources |
no | Reserve resources for UCP and Kubernetes components which are running on manager nodes. |
worker_kube_reserved_resources |
no | Reserve resources for UCP and Kubernetes components which are running on worker nodes. |
kubelet_max_pods |
yes | Sets the number of Pods that can run on a node. Default is 110 . |
secure_overlay |
no | Set to true to enable IPSec network encryption in Kubernetes. Default is
false . |
image_scan_aggregation_enabled |
no | Set to true to enable image scan result aggregation. This feature
displays image vulnerabilities in shared resource/containers and shared
resources/images pages. Default is false . |
swarm_polling_disabled |
no | Set to true to turn off auto-refresh (which defaults to 15 seconds)
and only call the Swarm API once. Default is false . |
Note
dev indicates that the functionality is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the Docker Enterprise Software Support Agreement.
Configures iSCSI options for UCP.
Parameter | Required | Description |
---|---|---|
--storage-iscsi=true |
no | Enables iSCSI based Persistent Volumes in Kubernetes. Default value is
false . |
--iscsiadm-path=<path> |
no | Specifies the path of the iscsiadm binary on the host. Default value is
/usr/sbin/iscsiadm . |
--iscsidb-path=<path> |
no | Specifies the path of the iSCSI database on the host. Default value is
/etc/iscsi . |
Configures a pre-logon message.
Parameter | Required | Description |
---|---|---|
pre_logon_message |
no | Sets pre-logon message to alert users before they proceed with login. |
Universal Control Plane (UCP) can use your local networking drivers to orchestrate your cluster. You can create a config network, with a driver like MACVLAN, and use it like any other named network in UCP. If it's set up as attachable, you can attach containers to it.
Security
Encrypting communication between containers on different nodes works only on overlay networks.
Always use UCP to create node-specific networks. You can use the UCP web UI or the CLI (with an admin bundle). If you create the networks without UCP, the networks won’t have the right access labels and won’t be available in UCP.
When you create the config-only networks, prefix the config-only network name with a node hostname prefix, like node1/my-cfg-network, node2/my-cfg-network, and so on. This is necessary to ensure that the access labels are applied consistently to all of the back-end config-only networks. UCP routes the config-only network creation to the appropriate node based on the node hostname prefix. All config-only networks with the same name must belong in the same collection, or UCP returns an error. Leaving the access label empty puts the network in the admin's default collection, which is / in a new UCP installation.

All UCP services are exposed using HTTPS, to ensure all communications between clients and UCP are encrypted. By default, this is done using self-signed TLS certificates that are not trusted by client tools like web browsers. So when you try to access UCP, your browser warns that it doesn't trust UCP or that UCP has an invalid certificate.
The same happens with other client tools.
$ curl https://ucp.example.org
SSL certificate problem: Invalid certificate chain
You can configure UCP to use your own TLS certificates, so that it is automatically trusted by your browser and client tools.
To ensure minimal impact to your business, you should plan for this change to happen outside business peak hours. Your applications will continue running normally, but existing UCP client certificates will become invalid, so users will have to download new ones to access UCP from the CLI.
To configure UCP to use your own TLS certificates and keys:
Log into the UCP web UI with administrator credentials and navigate to the Admin Settings page.
Click Certificates.
Upload your certificates and keys based on the following table:
Type | Description |
---|---|
Private key | The unencrypted private key of UCP. This key must correspond to the public key used in the server certificate. Click Upload Key. |
Server certificate | The public key certificate of UCP followed by the certificates of any intermediate certificate authorities which establishes a chain of trust up to the root CA certificate. Click Upload Certificate to upload a PEM file. |
CA certificate | The public key certificate of the root certificate authority that issued the UCP server certificate. If you don’t have one, use the top-most intermediate certificate instead. Click Upload CA Certificate to upload a PEM file. |
Client CA | This field is available in UCP 3.2. This field may contain one or more Root CA certificates which the UCP Controller will use to verify that client certificates are issued by a trusted entity. UCP is automatically configured to trust its internal CAs which issue client certificates as part of generated client bundles, however, you may supply UCP with additional custom root CA certificates here so that UCP may trust client certificates issued by your corporate or trusted third-party certificate authorities. Note that your custom root certificates will be appended to UCP’s internal root CA certificates. Click Upload CA Certificate to upload a PEM file. Click Download UCP Server CA Certificate to download the certificate as a PEM file. |
Click Save.
After replacing the TLS certificates, your users will not be able to authenticate with their old client certificate bundles. Ask your users to access the UCP web UI and download new client certificate bundles.
If you deployed Docker Trusted Registry (DTR), you’ll also need to reconfigure it to trust the new UCP TLS certificates.
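To confirm that UCP now presents the new certificate chain, you can inspect it from a client machine. This is a minimal sketch using openssl, with ucp.example.org as a placeholder:

openssl s_client -connect ucp.example.org:443 -showcerts </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer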
Docker Enterprise has its own image registry (DTR) so that you can store and manage the images that you deploy to your cluster. In this topic, you push an image to DTR and later deploy it to your cluster, using the Kubernetes orchestrator.
Instead of building an image from scratch, we’ll pull the official WordPress image from Docker Hub, tag it, and push it to DTR. Once that WordPress version is in DTR, only authorized users can change it.
To push images to DTR, you need CLI access to a licensed installation of Docker Enterprise.
When you’re set up for CLI-based access to a licensed Docker Enterprise instance, you can push images to DTR.
Pull the public WordPress image from Docker Hub:
docker pull wordpress
Tag the image, using the IP address or DNS name of your DTR instance:
docker tag wordpress:latest <dtr-url>:<port>/admin/wordpress:latest
Log in to a Docker Enterprise manager node.
Push the tagged image to DTR:
docker image push <dtr-url>:<port>/admin/wordpress:latest
In the DTR web UI, confirm that the wordpress:latest
image is stored
in your DTR instance.
In the DTR web UI, click Repositories.
Click wordpress to open the repo.
Click Images to view the stored images.
Confirm that the latest
tag is present.
You’re ready to deploy the wordpress:latest
image into production.
With the WordPress image stored in DTR, Docker Enterprise can deploy the image to a Kubernetes cluster with a simple Deployment object:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: wordpress-deployment
spec:
selector:
matchLabels:
app: wordpress
replicas: 2
template:
metadata:
labels:
app: wordpress
spec:
containers:
- name: wordpress
image: <dtr-url>:<port>/admin/wordpress:latest
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: wordpress-service
labels:
app: wordpress
spec:
type: NodePort
ports:
- port: 80
nodePort: 30081
selector:
app: wordpress
The Deployment object’s YAML specifies your DTR image in the pod
template spec: image: <dtr-url>:<port>/admin/wordpress:latest
. Also,
the YAML file defines a NodePort
service that exposes the WordPress
application, so it’s accessible from outside the cluster.
Open the Docker Enterprise web UI, and in the left pane, click Kubernetes.
Click Create to open the Create Kubernetes Object page.
In the Namespace dropdown, select default.
In the Object YAML editor, paste the Deployment object’s YAML.
Click Create. When the Kubernetes objects are created, the Load Balancers page opens.
Click wordpress-service, and in the details pane, find the Ports section.
Click the URL to open the default WordPress home page.
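Alternatively, you can check the published NodePort directly from the command line; a sketch assuming a cluster node reachable at 10.0.0.10:

curl -I http://10.0.0.10:30081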
When you add a node to the cluster, the node’s workloads are managed by a default orchestrator, either Docker Swarm or Kubernetes. When you install Docker Enterprise, new nodes are managed by Docker Swarm, but you can change the default orchestrator to Kubernetes in the administrator settings.
Changing the default orchestrator doesn’t affect existing nodes in the cluster. You can change the orchestrator type for individual nodes in the cluster by navigating to the node’s configuration page in the Docker Enterprise web UI.
You can change the current orchestrator for any node that’s joined to a Docker Enterprise cluster. The available orchestrator types are Kubernetes, Swarm, and Mixed.
The Mixed type enables workloads to be scheduled by Kubernetes and Swarm both on the same node. Although you can choose to mix orchestrator types on the same node, this isn’t recommended for production deployments because of the likelihood of resource contention.
To change a node’s orchestrator type from the Edit Node page:
Log in to the Docker Enterprise web UI with an administrator account.
Navigate to the Nodes page, and click the node that you want to assign to a different orchestrator.
In the details pane, click Configure and select Details to open the Edit Node page.
In the Orchestrator Properties section, click the orchestrator type for the node.
Click Save to assign the node to the selected orchestrator.
When you change the orchestrator type for a node, existing workloads are evicted, and they’re not migrated to the new orchestrator automatically. If you want the workloads to be scheduled by the new orchestrator, you must migrate them manually. For example, if you deploy WordPress on a Swarm node, and you change the node’s orchestrator type to Kubernetes, Docker Enterprise doesn’t migrate the workload, and WordPress continues running on Swarm. In this case, you must migrate your WordPress deployment to Kubernetes manually.
The following table summarizes the results of changing a node’s orchestrator.
Workload | On orchestrator change |
---|---|
Containers | Container continues running in node |
Docker service | Node is drained, and tasks are rescheduled to another node |
Pods and other imperative resources | Continue running in node |
Deployments and other declarative resources | Might change, but for now, continue running in node |
If a node is running containers, and you change the node to Kubernetes,
these containers will continue running, and Kubernetes won’t be aware of
them, so you’ll be in the same situation as if you were running in
Mixed
mode.
Warning
Be careful when mixing orchestrators on a node.
When you change a node’s orchestrator, you can choose to run the node
in a mixed mode, with both Kubernetes and Swarm workloads. The
Mixed
type is not intended for production use, and it may impact
existing workloads on the node.
This is because the two orchestrator types have different views of the node’s resources, and they don’t know about each other’s workloads. One orchestrator can schedule a workload without knowing that the node’s resources are already committed to another workload that was scheduled by the other orchestrator. When this happens, the node could run out of memory or other resources.
For this reason, we recommend not mixing orchestrators on a production node.
You can set the default orchestrator for new nodes to Kubernetes or Swarm.
To set the orchestrator for new nodes:
Log in to the Docker Enterprise web UI with an administrator account.
Open the Admin Settings page, and in the left pane, click Scheduler.
Under Set Orchestrator Type for New Nodes, click Swarm or Kubernetes.
Click Save.
From now on, when you join a node to the cluster, new workloads on the node are scheduled by the specified orchestrator type. Existing nodes in the cluster aren’t affected.
Once a node is joined to the cluster, you can change the orchestrator that schedules its workloads.
DTR in mixed mode
The default behavior for DTR nodes is to run in mixed orchestration. If a DTR node's orchestrator type is changed to Swarm only or Kubernetes only, reconciliation will revert the node back to mixed mode. This is the expected behavior.
The workloads on your cluster can be scheduled by Kubernetes or by Swarm, or the cluster can be mixed, running both orchestrator types. If you choose to run a mixed cluster, be aware that the different orchestrators aren’t aware of each other, and there’s no coordination between them.
We recommend that you make the decision about orchestration when you set up the cluster initially. Commit to Kubernetes or Swarm on all nodes, or assign each node individually to a specific orchestrator. Once you start deploying workloads, avoid changing the orchestrator setting. If you do change the orchestrator for a node, your workloads are evicted, and you must deploy them again through the new orchestrator.
Node demotion and orchestrator type
When you promote a worker node to be a manager, its orchestrator type
automatically changes to Mixed
. If you demote the same node to be
a worker, its orchestrator type remains as Mixed
.
Set the orchestrator on a node by assigning the orchestrator labels,
com.docker.ucp.orchestrator.swarm
or
com.docker.ucp.orchestrator.kubernetes
, to true
.
To schedule Swarm workloads on a node:
docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
To schedule Kubernetes workloads on a node:
docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
To schedule Kubernetes and Swarm workloads on a node:
docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
Warning
Mixed nodes
Scheduling both Kubernetes and Swarm workloads on a node is not recommended for production deployments, because of the likelihood of resource contention.
To change the orchestrator type for a node from Swarm to Kubernetes:
docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
docker node update --label-rm com.docker.ucp.orchestrator.swarm <node-id>
UCP detects the node label change and updates the Kubernetes node accordingly.
Check the value of the orchestrator label by inspecting the node:
docker node inspect <node-id> | grep -i orchestrator
The docker node inspect command returns the node's configuration, including the orchestrator:
"com.docker.ucp.orchestrator.kubernetes": "true"
Important
Orchestrator label
The com.docker.ucp.orchestrator label isn't displayed in the Labels list for a node in the Docker Enterprise web UI.
The default orchestrator for new nodes is a setting in the Docker Enterprise configuration file:
default_node_orchestrator = "swarm"
The value can be swarm or kubernetes.
With Docker Enterprise, administrators can filter the view of Kubernetes objects by the namespace the objects are assigned to. You can specify a single namespace, or you can specify all available namespaces.
In this example, you create two Kubernetes namespaces and deploy a service to both of them.
Log in to the UCP web UI with an administrator account.
In the left pane, click Kubernetes.
Click Create to open the Create Kubernetes Object page.
In the Object YAML editor, paste the following YAML.
apiVersion: v1
kind: Namespace
metadata:
  name: blue
---
apiVersion: v1
kind: Namespace
metadata:
  name: green
Click Create to create the blue and green namespaces.
Create a NodePort service in the blue namespace.
Navigate to the Create Kubernetes Object page.
In the Namespace dropdown, select blue.
In the Object YAML editor, paste the following YAML.
apiVersion: v1
kind: Service
metadata:
  name: app-service-blue
  labels:
    app: app-blue
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: app-blue
Click Create to deploy the service in the blue namespace.
Repeat the previous steps with the following YAML, but this time, select green from the Namespace dropdown.
apiVersion: v1
kind: Service
metadata:
  name: app-service-green
  labels:
    app: app-green
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32769
  selector:
    app: app-green
Currently, the Namespaces view is set to the default namespace, so the Load Balancers page doesn’t show your services.
With the Set context for all namespaces toggle set, you see all of the Kubernetes objects in every namespace. Now filter the view to show only objects in one namespace.
In the left pane, click Namespaces to open the list of namespaces.
In the green namespace, click the More options icon and in the context menu, select Set Context.
Click Confirm to set the context to the green namespace. The indicator in the left pane changes to green.
Click Load Balancers to view your app-service-green service. The app-service-blue service doesn't appear.
To view the app-service-blue service, repeat the previous steps, but this time, select Set Context on the blue namespace.
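If you prefer the CLI, you can verify the same separation with kubectl from a client bundle; this is a minimal sketch using the namespace and service names from the YAML above:
kubectl get services --namespace blue
kubectl get services --namespace green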
UCP is designed for high availability (HA). You can join multiple manager nodes to the cluster, so that if one manager node fails, another can automatically take its place without impact to the cluster.
Having multiple manager nodes in your cluster allows you to:
To make the cluster tolerant to more failures, add additional replica nodes to your cluster.
Manager nodes | Failures tolerated |
---|---|
1 | 0 |
3 | 1 |
5 | 2 |
For production-grade deployments, follow these best practices:
Docker Enterprise is designed for scaling horizontally as your applications grow in size and usage. You can add or remove nodes from the cluster to scale it to your needs. You can join Windows Server and Linux nodes to the cluster.
Because Docker Enterprise leverages the clustering functionality provided by Docker Engine, you use the docker swarm join command to add more nodes to your cluster. When you join a new node, Docker Enterprise services start running on the node automatically.
When you join a node to a cluster, you specify its role: manager or worker.
Manager: Manager nodes are responsible for cluster management functionality and dispatching tasks to worker nodes. Having multiple manager nodes allows your swarm to be highly available and tolerant of node failures.
Manager nodes also run all Docker Enterprise components in a replicated way, so by adding additional manager nodes, you’re also making the cluster highly available.
Worker: Worker nodes receive and execute your services and applications. Having multiple worker nodes allows you to scale the computing capacity of your cluster.
When deploying Docker Trusted Registry in your cluster, you deploy it to a worker node.
You can join Windows Server and Linux nodes to the cluster, but only Linux nodes can be managers.
To join nodes to the cluster, go to the UCP web interface and navigate to the Nodes page.
Click Add Node to add a new node.
Select the type of node to add, Windows or Linux.
Click Manager if you want to add the node as a manager.
Check the Use a custom listen address option to specify the address and port where the new node listens for inbound cluster management traffic.
Check the Use a custom advertise address option to specify the IP address that's advertised to all members of the cluster for API access.
Copy the displayed command, use SSH to log in to the host that you want to join to the cluster, and run the docker swarm join command on the host.
To add a Windows node, click Windows and follow the instructions in Join Windows worker nodes to a cluster.
After you run the join command in the node, the node is displayed on the Nodes page in the UCP web interface. From there, you can change the node’s cluster configuration, including its assigned orchestrator type.
Once a node is part of the cluster, you can configure the node’s availability so that it is:
Pause or drain a node from the Edit Node page:
You can promote worker nodes to managers to make UCP fault tolerant. You can also demote a manager node into a worker.
To promote or demote a manager node:
If you are load balancing user requests to Docker Enterprise across multiple manager nodes, remember to remove these nodes from the load-balancing pool when demoting them to workers.
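From the CLI with a UCP client bundle, the equivalent Swarm commands are shown below (<node-id> is a placeholder for the node you want to change):
docker node promote <node-id>
docker node demote <node-id>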
Worker nodes can be removed from a cluster at any time.
Manager nodes are integral to the cluster's overall health, so be careful when removing one from the cluster.
You can use the Docker CLI client to manage your nodes from the CLI. To do this, configure your Docker CLI client with a UCP client bundle.
Once you do that, you can start managing your UCP nodes:
docker node ls
You can use the API to manage your nodes in the following ways:
Use the node update API, /nodes/{id}/update, to add the orchestrator label (for example, com.docker.ucp.orchestrator.kubernetes).
Use the /api/ucp/config-toml API to change the default orchestrator setting.
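For example, a hedged sketch of the config-toml workflow, assuming an admin bearer token in $AUTHTOKEN and the UCP hostname in $UCP_HOSTNAME (the same placeholder names used in the backup API examples later in this guide), and assuming the endpoint accepts a download with GET and an upload with PUT (curl --upload-file):
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/config-toml > ucp-config.toml
# Edit default_node_orchestrator in ucp-config.toml, then upload the modified file.
curl -sk -H "Authorization: Bearer $AUTHTOKEN" --upload-file ucp-config.toml https://$UCP_HOSTNAME/api/ucp/config-toml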
Docker Enterprise 3.0 supports worker nodes that run on Windows Server 2019. Only worker nodes are supported on Windows, and all manager nodes in the cluster must run on Linux.
To enable a worker node on Windows:
Install Docker Engine - Enterprise on a Windows Server 2019 before joining the node to a Docker Enterprise Cluster.
To configure the Docker daemon and the Windows environment:
Pull the Windows-specific image of ucp-agent, which is named ucp-agent-win.
Run the Windows node setup script provided with ucp-agent-win.
As of Docker Enterprise 2.1, which includes UCP 3.1, it is no longer necessary to label Windows nodes manually: they are automatically assigned the label ostype=windows.
On a manager node, run the following command to list the images that are required on Windows nodes.
docker container run --rm docker/ucp:3.2.5 images --list --enable-windows
docker/ucp-agent-win:3.2.5
docker/ucp-dsinfo-win:3.2.5
On a Windows Server node, in a PowerShell terminal running as
Administrator, log in to Docker Hub with the docker login
command
and pull the listed images.
docker image pull docker/ucp-agent-win:3.2.5
docker image pull docker/ucp-dsinfo-win:3.2.5
If the cluster is deployed in an offline site, where the nodes do not have access to Docker Hub, UCP images can be sideloaded onto the Windows Server nodes. Follow the instructions on the offline installation page to sideload the images.
The script opens ports 2376 and 12376 and creates certificates for the Docker daemon to communicate securely. The script also re-registers the docker service in Windows to use named pipes, sets it to enforce TLS communication over port 2376, and provides paths to the UCP certificates.
Use this command to run the Windows node setup script:
$script = [ScriptBlock]::Create((docker run --rm docker/ucp-agent-win:3.2.5 windows-script | Out-String))
Invoke-Command $script
Note
Running windows-script restarts the Docker daemon, so the Docker service is temporarily unavailable while the restart completes.
The Windows node is ready to join the cluster. Run the setup script on each instance of Windows Server that will be a worker node.
The script may be incompatible with installations that use a config file at C:\ProgramData\docker\config\daemon.json. If you use such a file, make sure that the daemon runs on port 2376 and that it uses certificates located in C:\ProgramData\docker\daemoncerts. If certificates don't exist in this directory, run ucp-agent-win generate-certs, as shown in Step 2 of the procedure in Set up certs for the dockerd service.
In the daemon.json file, set the tlscacert, tlscert, and tlskey options to the corresponding files in C:\ProgramData\docker\daemoncerts:
{
...
"debug": true,
"tls": true,
"tlscacert": "C:\\ProgramData\\docker\\daemoncerts\\ca.pem",
"tlscert": "C:\\ProgramData\\docker\\daemoncerts\\cert.pem",
"tlskey": "C:\\ProgramData\\docker\\daemoncerts\\key.pem",
"tlsverify": true,
...
}
To join the cluster using the docker swarm join command provided by the UCP web interface and CLI:
Log in to the UCP web interface with an administrator account.
Navigate to the Nodes page.
Click Add Node to add a new node.
In the Node Type section, click Windows.
In the Step 2 section, select the check box for “I have followed the instructions and I’m ready to join my Windows node.”
Select the Use a custom listen address option to specify the address and port where the new node listens for inbound cluster management traffic.
Select the Use a custom advertise address option to specify the IP address that's advertised to all members of the cluster for API access.
Copy the displayed command. It looks similar to the following:
docker swarm join --token <token> <ucp-manager-ip>
You can also use the command line to get the join token. Using your UCP client bundle, run:
docker swarm join-token worker
Run the docker swarm join command on each instance of Windows Server that will be a worker node.
The following sections describe how to run the commands in the setup script manually to configure the dockerd service and the Windows environment. dockerd is the persistent process that manages containers. The script opens ports in the firewall and sets up certificates for dockerd.
To see the script, you can run the windows-script command without piping it to the Invoke-Expression cmdlet.
docker container run --rm docker/ucp-agent-win:3.2.5 windows-script
Docker Enterprise requires that ports 2376 and 12376 are open for inbound TCP traffic.
In a PowerShell terminal running as Administrator, run these commands to add rules to the Windows firewall.
netsh advfirewall firewall add rule name="docker_local" dir=in action=allow protocol=TCP localport=2376
netsh advfirewall firewall add rule name="docker_proxy" dir=in action=allow protocol=TCP localport=12376
To set up certs for the dockerd service:
Create the directory C:\ProgramData\docker\daemoncerts.
In a PowerShell terminal running as Administrator, run the following command to generate certificates.
docker container run --rm -v C:\ProgramData\docker\daemoncerts:C:\certs docker/ucp-agent-win:3.2.5 generate-certs
To set up certificates, run the following commands to stop and unregister the dockerd service, register the service with the certificates, and restart the service.
Stop-Service docker
dockerd --unregister-service
dockerd -H npipe:// -H 0.0.0.0:2376 --tlsverify --tlscacert=C:\ProgramData\docker\daemoncerts\ca.pem --tlscert=C:\ProgramData\docker\daemoncerts\cert.pem --tlskey=C:\ProgramData\docker\daemoncerts\key.pem --register-service
Start-Service docker
The dockerd service and the Windows environment are now configured to join a Docker Enterprise cluster.
Note
If the TLS certificates aren’t set up correctly, the UCP web interface shows the following warning:
Node WIN-NOOQV2PJGTE is a Windows node that cannot connect to its local Docker daemon.
The following features are not yet supported on Windows Server 2019:
Encrypted overlay networks. If you've upgraded from a previous version, you must also recreate the ucp-hrm network to make it unencrypted.
Once you've joined multiple manager nodes for high availability (HA), you can configure your own load balancer to balance user requests across all manager nodes.
This allows users to access UCP using a centralized domain name. If a manager node goes down, the load balancer can detect that and stop forwarding requests to that node, so that the failure goes unnoticed by users.
Since UCP uses mutual TLS, make sure you configure your load balancer to:
Load balance TCP traffic on ports 443 and 6443.
Not terminate HTTPS connections.
Use the /_ping endpoint on each manager node to check whether the node is healthy and whether it should remain in the load-balancing pool.
By default, both UCP and DTR use port 443. If you plan on deploying UCP and DTR, your load balancer needs to distinguish traffic between the two by IP address or port number.
Important
Additional requirements
In addition to configuring your load balancer to distinguish between UCP and DTR, configuring a load balancer for DTR has further requirements (refer to the DTR documentation).
Use the following examples to configure your load balancer for UCP.
You can deploy your load balancer using:
UCP uses Calico as the default Kubernetes networking solution. Calico is configured to create a BGP mesh between all nodes in the cluster.
As you add more nodes to the cluster, networking performance starts decreasing. If your cluster has more than 100 nodes, you should reconfigure Calico to use Route Reflectors instead of a node-to-node mesh.
This article guides you in deploying Calico Route Reflectors in a UCP cluster. UCP running on Microsoft Azure uses Azure SDN instead of Calico for multi-host networking. If your UCP deployment is running on Azure, you don’t need to configure it this way.
For production-grade systems, you should deploy at least two Route Reflectors, each running on a dedicated node. These nodes should not be running any other workloads.
If Route Reflectors are running on a same node as other workloads, swarm ingress and NodePorts might not work in these workloads.
Taint the nodes to ensure that they are unable to run other workloads.
For each dedicated node, run:
kubectl taint node <node-name> \
com.docker.ucp.kubernetes.calico/route-reflector=true:NoSchedule
Add labels to those nodes:
kubectl label nodes <node-name> \
com.docker.ucp.kubernetes.calico/route-reflector=true
Create a calico-rr.yaml file with the following content:
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: calico-rr
  namespace: kube-system
  labels:
    app: calico-rr
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      k8s-app: calico-rr
  template:
    metadata:
      labels:
        k8s-app: calico-rr
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
        - key: com.docker.ucp.kubernetes.calico/route-reflector
          value: "true"
          effect: NoSchedule
      hostNetwork: true
      containers:
        - name: calico-rr
          image: calico/routereflector:v0.6.1
          env:
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            - name: IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /calico-secrets
              name: etcd-certs
          securityContext:
            privileged: true
      nodeSelector:
        com.docker.ucp.kubernetes.calico/route-reflector: "true"
      volumes:
        # Mount in the etcd TLS secrets.
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
Deploy the DaemonSet using:
kubectl create -f calico-rr.yaml
To reconfigure Calico to use Route Reflectors instead of a node-to-node mesh, you need to tell calicoctl where to find the etcd key-value store managed by UCP. From a CLI with a UCP client bundle, create a shell alias to start calicoctl using the docker/ucp-dsinfo image:
UCP_VERSION=$(docker version --format {% raw %}'{{index (split .Server.Version "/") 1}}'{% endraw %})
alias calicoctl="\
docker run -i --rm \
--pid host \
--net host \
-e constraint:ostype==linux \
-e ETCD_ENDPOINTS=127.0.0.1:12378 \
-e ETCD_KEY_FILE=/ucp-node-certs/key.pem \
-e ETCD_CA_CERT_FILE=/ucp-node-certs/ca.pem \
-e ETCD_CERT_FILE=/ucp-node-certs/cert.pem \
-v /var/run/calico:/var/run/calico \
-v ucp-node-certs:/ucp-node-certs:ro \
docker/ucp-dsinfo:${UCP_VERSION} \
calicoctl \
"
After configuring calicoctl, check the current Calico BGP configuration:
calicoctl get bgpconfig
If you don’t see any configuration listed, create one:
calicoctl create -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
name: default
spec:
logSeverityScreen: Info
nodeToNodeMeshEnabled: false
asNumber: 63400
EOF
This action creates a new configuration with node-to-node mesh BGP disabled.
If you have an existing configuration with nodeToNodeMeshEnabled set to true:
Update your configuration:
calicoctl get bgpconfig --output yaml > bgp.yaml
Edit the bgp.yaml
file, updating nodeToNodeMeshEnabled
to false
.
Update the Calico configuration:
calicoctl replace -f - < bgp.yaml
To configure Calico to use the Route Reflectors, you first need to know the AS number for your network. To find it, run:
calicoctl get nodes --output=wide
Using the AS number, create the Calico configuration by customizing and running the following snippet for each route reflector:
calicoctl create -f - << EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
name: bgppeer-global
spec:
peerIP: <IP_RR>
asNumber: <AS_NUMBER>
EOF
Where:
IP_RR is the IP address of the node where the Route Reflector pod is deployed.
AS_NUMBER is the same AS number used by your nodes.
Manually delete any calico-node pods that are running on the nodes dedicated to Route Reflectors. This ensures that calico-node pods and Route Reflectors never run on the same node.
Using your UCP client bundle:
# Find the Pod name
kubectl -n kube-system \
get pods --selector k8s-app=calico-node -o wide | \
grep <node-name>
# Delete the Pod
kubectl -n kube-system delete pod <pod-name>
Verify that the calico-node pods running on other nodes are peering with the Route Reflector. From a CLI with a UCP client bundle, use a Swarm affinity filter to run calicoctl node status on any node running calico-node:
UCP_VERSION=$(docker version --format {% raw %}'{{index (split .Server.Version "/") 1}}'{% endraw %})
docker run -i --rm \
--pid host \
--net host \
-e affinity:container=='k8s_calico-node.*' \
-e ETCD_ENDPOINTS=127.0.0.1:12378 \
-e ETCD_KEY_FILE=/ucp-node-certs/key.pem \
-e ETCD_CA_CERT_FILE=/ucp-node-certs/ca.pem \
-e ETCD_CERT_FILE=/ucp-node-certs/cert.pem \
-v /var/run/calico:/var/run/calico \
-v ucp-node-certs:/ucp-node-certs:ro \
docker/ucp-dsinfo:${UCP_VERSION} \
calicoctl node status
The output should resemble the following sample:
IPv4 BGP status
+--------------+-----------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-----------+-------+----------+-------------+
| 172.31.24.86 | global | up | 23:10:04 | Established |
+--------------+-----------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
You can monitor the status of UCP using the web UI or the CLI. You
can also use the _ping
endpoint to build monitoring automation.
The first place to check the status of UCP is the UCP web UI, since it shows warnings for situations that require your immediate attention. Administrators might see more warnings than regular users.
You can also navigate to the Nodes page, to see if all the nodes managed by UCP are healthy or not.
Each node has a status message explaining any problems with the node. In this example, a Windows worker node is down. Click the node to get more info on its status. In the details pane, click Actions and select Agent logs to see the log entries from the node.
You can also monitor the status of a UCP cluster using the Docker CLI client. Download a UCP client certificate bundle and then run:
docker node ls
As a rule of thumb, if the status message starts with [Pending], then the current state is transient and the node is expected to correct itself back into a healthy state.
You can use the https://<ucp-manager-url>/_ping endpoint to check the health of a single UCP manager node. When you access this endpoint, the UCP manager validates that all its internal components are working, and returns one of the following HTTP status codes:
If an administrator client certificate is used as a TLS client certificate for the _ping endpoint, a detailed error message is returned if any component is unhealthy.
If you're accessing the _ping endpoint through a load balancer, you have no way of knowing which UCP manager node is unhealthy, since any manager node might be serving your request. Make sure you connect directly to the URL of a manager node rather than to a load balancer. Also note that sending a HEAD request to the endpoint results in a 404 error code; use GET instead.
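For example, a minimal monitoring sketch that sends a GET directly to one manager, using the certificate files from an admin client bundle for mutual TLS (the bundle file names cert.pem, key.pem, and ca.pem are assumed to be in the current directory):
curl -s --cert cert.pem --key key.pem --cacert ca.pem https://<ucp-manager-ip>/_ping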
For those implementations with a subscription, UCP displays image vulnerability count data from the DTR image scanning feature. UCP displays vulnerability counts for containers, Swarm services, pods, and images.
To enable this feature, DTR 2.6 is required and single sign-on with UCP must be enabled.
Web UI disk usage metrics, including free space, only reflect the Docker
managed portion of the filesystem: /var/lib/docker
. To monitor the
total space available on each filesystem of a UCP worker or manager, you
must deploy a third-party monitoring solution to monitor the operating
system.
There are several cases in the lifecycle of UCP when a node is actively transitioning from one state to another, such as when a new node is joining the cluster or during node promotion and demotion. In these cases, the current step of the transition will be reported by UCP as a node message. You can view the state of each individual node by monitoring the cluster status.
The following table lists all possible node states that may be reported for a UCP node, their explanation, and the expected duration of a given step.
Message | Description | Typical step duration |
---|---|---|
Completing node registration | Waiting for the node to appear in KV node inventory. This is expected to occur when a node first joins the UCP swarm. | 5 - 30 seconds |
heartbeat failure | The node has not contacted any swarm managers in the last 10 seconds. Check Swarm state in docker info on the node. inactive means the node has been removed from the swarm with docker swarm leave. pending means dockerd on the node has been attempting to contact a manager since dockerd on the node started. Confirm network security policy allows tcp port 2377 from the node to managers. error means an error prevented swarm from starting on the node. Check docker daemon logs on the node. | Until resolved |
Node is being reconfigured | The ucp-reconcile container is currently converging the current state of the node to the desired state. This process may involve issuing certificates, pulling missing images, and starting containers, depending on the current node state. | 1 - 60 seconds |
Reconfiguration pending | The target node is expected to be a manager but the ucp-reconcile container has not been started yet. | 1 - 10 seconds |
The ucp-agent task is state | The ucp-agent task on the target node is not in a running state yet. This is an expected message when configuration has been updated, or when a new node was first joined to the UCP cluster. This step may take a longer time duration than expected if the UCP images need to be pulled from Docker Hub on the affected node. | 1 - 10 seconds |
Unable to determine node state | The ucp-reconcile container on the target node just started running and we are not able to determine its state. | 1 - 10 seconds |
Unhealthy UCP Controller: node is unreachable | Other manager nodes of the cluster have not received a heartbeat message from the affected node within a predetermined timeout. This usually indicates that there’s either a temporary or permanent interruption in the network link to that manager node. Ensure the underlying networking infrastructure is operational, and contact support if the symptom persists. | Until resolved |
Unhealthy UCP Controller: unable to reach controller | The controller that we are currently communicating with is not reachable within a predetermined timeout. Please refresh the node listing to see if the symptom persists. If the symptom appears intermittently, this could indicate latency spikes between manager nodes, which can lead to temporary loss in the availability of UCP itself. Please ensure the underlying networking infrastructure is operational, and contact support if the symptom persists. | Until resolved |
Unhealthy UCP Controller: Docker Swarm Cluster: Local node <ip> has status Pending | The Engine ID of an engine is not unique in the swarm. When a node first joins the cluster, it's added to the node inventory and discovered as Pending by Docker Swarm. The engine is "validated" if a ucp-swarm-manager container can connect to it via TLS, and if its Engine ID is unique in the swarm. If you see this issue repeatedly, make sure that your engines don't have duplicate IDs. Use docker info to see the Engine ID. Refresh the ID by removing the /etc/docker/key.json file and restarting the daemon. | Until resolved |
If you detect problems in your UCP cluster, you can start your troubleshooting session by checking the logs of the individual UCP components. Only administrators can see information about UCP system containers.
To see the logs of the UCP system containers, navigate to the Containers page of the UCP web UI. By default, UCP system containers are hidden. Click the Settings icon and check Show system resources to view the UCP system containers.
Click on a container to see more details, such as its configurations and logs.
You can also check the logs of UCP system containers from the CLI. This is especially useful if the UCP web application is not working.
Get a client certificate bundle.
When using the Docker CLI client, you need to authenticate using client certificates. If your client certificate bundle is for a non-admin user, you do not have permission to see the UCP system containers.
Check the logs of UCP system containers. By default, system
containers aren’t displayed. Use the -a
flag to display them.
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8b77cfa87889 docker/ucp-agent:latest "/bin/ucp-agent re..." 3 hours ago Exited (0) 3 hours ago ucp-reconcile
b844cf76a7a5 docker/ucp-agent:latest "/bin/ucp-agent agent" 3 hours ago Up 3 hours 2376/tcp ucp-agent.tahzo3m4xjwhtsn6l3n8oc2bf.xx2hf6dg4zrphgvy2eohtpns9
de5b45871acb docker/ucp-controller:latest "/bin/controller s..." 3 hours ago Up 3 hours (unhealthy) 0.0.0.0:443->8080/tcp ucp-controller
...
Get the log from a UCP container by using the docker logs <ucp container ID> command. For example, the following command emits the log for the ucp-controller container listed above.
$ docker logs de5b45871acb
{"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/json",
"remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}
{"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/logs",
"remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}
Before making any changes to UCP, download a support dump. This allows you to troubleshoot problems which were already happening before changing UCP configurations.
You can then increase the UCP log level to debug, making it easier to understand the status of the UCP cluster. Changing the UCP log level restarts all UCP system components and introduces a small downtime window to UCP. Your applications will not be affected by this downtime.
To increase the UCP log level, navigate to the UCP web UI, go to the Admin Settings tab, and choose Logs.
Once you change the log level to Debug, the UCP containers restart. Now that the UCP components are creating more descriptive logs, you can download a support dump and use it to troubleshoot the component causing the problem.
Depending on the problem you’re experiencing, it’s more likely that you’ll find related messages in the logs of specific components on manager nodes:
- The ucp-reconcile container
- The ucp-controller container
- The ucp-auth-api and ucp-auth-store containers
It's normal for the ucp-reconcile container to be in a stopped state. This container starts only when the ucp-agent detects that a node needs to transition to a different state. The ucp-reconcile container is responsible for creating and removing containers, issuing certificates, and pulling missing images.
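For example, to review the most recent reconciliation on a manager node, you can read the stopped container's log directly (a small sketch; the container name matches the docker ps output above):
docker logs ucp-reconcile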
UCP automatically tries to heal itself by monitoring its internal components and trying to bring them to a healthy state.
In most cases, if a single UCP component is in a failed state persistently, you should be able to restore the cluster to a healthy state by removing the unhealthy node from the cluster and joining it again.
UCP persists configuration data on an etcd key-value store and RethinkDB database that are replicated on all manager nodes of the UCP cluster. These data stores are for internal use only and should not be used by other applications.
In this example, we use curl to make requests to the key-value store REST API and jq to process the responses.
You can install these tools on an Ubuntu distribution by running:
sudo apt-get update && sudo apt-get install curl jq
Use a client bundle to authenticate your requests.
Use the REST API to access the cluster configurations. The
$DOCKER_HOST
and $DOCKER_CERT_PATH
environment variables are
set when using the client bundle.
export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379"
curl -s \
--cert ${DOCKER_CERT_PATH}/cert.pem \
--key ${DOCKER_CERT_PATH}/key.pem \
--cacert ${DOCKER_CERT_PATH}/ca.pem \
${KV_URL}/v2/keys | jq "."
The containers running the key-value store include etcdctl, a command-line client for etcd. You can run it using the docker exec command.
The examples below assume you are logged in with ssh into a UCP manager node.
docker exec -it ucp-kv etcdctl \
--endpoint https://127.0.0.1:2379 \
--ca-file /etc/docker/ssl/ca.pem \
--cert-file /etc/docker/ssl/cert.pem \
--key-file /etc/docker/ssl/key.pem \
cluster-health
member 16c9ae1872e8b1f0 is healthy: got healthy result from https://192.168.122.64:12379
member c5a24cfdb4263e72 is healthy: got healthy result from https://192.168.122.196:12379
member ca3c1bb18f1b30bf is healthy: got healthy result from https://192.168.122.223:12379
cluster is healthy
On failure, the command exits with an error code and no output.
User and organization data for Docker Enterprise Edition is stored in a RethinkDB database which is replicated across all manager nodes in the UCP cluster.
Replication and failover of this database is typically handled automatically by UCP’s own configuration management processes, but detailed database status and manual reconfiguration of database replication is available through a command line tool available as part of UCP.
The examples below assume you are logged in with ssh into a UCP manager node.
# NODE_ADDRESS will be the IP address of this Docker Swarm manager node
NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
# VERSION will be your most recent version of the docker/ucp-auth image
VERSION=$(docker image ls --format '{{.Tag}}' docker/ucp-auth | head -n 1)
# This command will output detailed status of all servers and database tables
# in the RethinkDB cluster.
docker container run --rm -v ucp-auth-store-certs:/tls docker/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 db-status
Server Status: [
{
"ID": "ffa9cd5a-3370-4ccd-a21f-d7437c90e900",
"Name": "ucp_auth_store_192_168_1_25",
"Network": {
"CanonicalAddresses": [
{
"Host": "192.168.1.25",
"Port": 12384
}
],
"TimeConnected": "2017-07-14T17:21:44.198Z"
}
}
]
...
# NODE_ADDRESS will be the IP address of this Docker Swarm manager node
NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
# NUM_MANAGERS will be the current number of manager nodes in the cluster
NUM_MANAGERS=$(docker node ls --filter role=manager -q | wc -l)
# VERSION will be your most recent version of the docker/ucp-auth image
VERSION=$(docker image ls --format '{{.Tag}}' docker/ucp-auth | head -n 1)
# This reconfigure-db command will repair the RethinkDB cluster to have a
# number of replicas equal to the number of manager nodes in the cluster.
docker container run --rm -v ucp-auth-store-certs:/tls docker/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas ${NUM_MANAGERS}
time="2017-07-14T20:46:09Z" level=debug msg="Connecting to db ..."
time="2017-07-14T20:46:09Z" level=debug msg="connecting to DB Addrs: [192.168.1.25:12383]"
time="2017-07-14T20:46:09Z" level=debug msg="Reconfiguring number of replicas to 1"
time="2017-07-14T20:46:09Z" level=debug msg="(00/16) Reconfiguring Table Replication..."
time="2017-07-14T20:46:09Z" level=debug msg="(01/16) Reconfigured Replication of Table \"grant_objects\""
...
Loss of Quorum in RethinkDB Tables
When there is a loss of quorum in any of the RethinkDB tables, run the reconfigure-db command with the --emergency-repair flag.
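A sketch of that invocation, reusing the NODE_ADDRESS, VERSION, and NUM_MANAGERS variables defined in the commands above; keeping --num-replicas alongside the new flag is an assumption based on the earlier example:
docker container run --rm -v ucp-auth-store-certs:/tls docker/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas ${NUM_MANAGERS} --emergency-repair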
Disaster recovery procedures should be performed in the following order:
Swarm is resilient to failures and the swarm can recover from any number of temporary node failures (machine reboots or crash with restart) or other transient errors. However, a swarm cannot automatically recover if it loses a quorum. Tasks on existing worker nodes continue to run, but administrative tasks are not possible, including scaling or updating services and joining or removing nodes from the swarm. The best way to recover is to bring the missing manager nodes back online. If that is not possible, continue reading for some options for recovering your swarm.
In a swarm of N
managers, a quorum (a majority) of manager nodes
must always be available. For example, in a swarm with 5 managers, a
minimum of 3 must be operational and in communication with each other.
In other words, the swarm can tolerate up to (N-1)/2
permanent
failures beyond which requests involving swarm management cannot be
processed. These types of failures include data corruption or hardware
failures.
If you lose the quorum of managers, you cannot administer the swarm. If you have lost the quorum and you attempt to perform any management operation on the swarm, an error occurs:
Error response from daemon: rpc error: code = 4 desc = context deadline exceeded
The best way to recover from losing the quorum is to bring the failed
nodes back online. If you can’t do that, the only way to recover from
this state is to use the --force-new-cluster
action from a manager
node. This removes all managers except the manager the command was run
from. The quorum is achieved because there is now only one manager.
Promote nodes to be managers until you have the desired number of
managers.
# From the node to recover
$ docker swarm init --force-new-cluster --advertise-addr node01:2377
When you run the docker swarm init
command with the
--force-new-cluster
flag, the Docker Engine where you run the
command becomes the manager node of a single-node swarm which is capable
of managing and running services. The manager has all the previous
information about services and tasks, worker nodes are still part of the
swarm, and services are still running. You need to add or re-add manager
nodes to achieve your previous task distribution and ensure that you
have enough managers to maintain high availability and prevent losing
the quorum.
Generally, you do not need to force the swarm to rebalance its tasks. When you add a new node to a swarm, or a node reconnects to the swarm after a period of unavailability, the swarm does not automatically give a workload to the idle node. This is a design decision. If the swarm periodically shifted tasks to different nodes for the sake of balance, the clients using those tasks would be disrupted. The goal is to avoid disrupting running services for the sake of balance across the swarm. When new tasks start, or when a node with running tasks becomes unavailable, those tasks are given to less busy nodes. The goal is eventual balance, with minimal disruption to the end user.
In Docker 1.13 and higher, you can use the --force
or -f
flag
with the docker service update
command to force the service to
redistribute its tasks across the available worker nodes. This causes
the service tasks to restart. Client applications may be disrupted. If
you have configured it, your service uses a rolling
update.
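For example, where <service-name> is a placeholder for one of your services:
docker service update --force <service-name>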
If you use an earlier version and you want to achieve an even balance of
load across workers and don’t mind disrupting running tasks, you can
force your swarm to re-balance by temporarily scaling the service
upward. Use docker service inspect --pretty <servicename>
to see the
configured scale of a service. When you use docker service scale
,
the nodes with the lowest number of tasks are targeted to receive the
new workloads. There may be multiple under-loaded nodes in your swarm.
You may need to scale the service up by modest increments a few times to
achieve the balance you want across all the nodes.
When the load is balanced to your satisfaction, you can scale the
service back down to the original scale. You can use
docker service ps
to assess the current balance of your service
across nodes.
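A short sketch of that approach, using a hypothetical service named my-service that is normally configured with 5 replicas:
# Check the configured scale, temporarily scale up, inspect placement, then scale back down.
docker service inspect --pretty my-service
docker service scale my-service=7
docker service ps my-service
docker service scale my-service=5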
Disaster recovery procedures should be performed in the following order:
In the event half or more manager nodes are lost and cannot be recovered to a healthy state, the system is considered to have lost quorum and can only be restored through the following disaster recovery procedure.
If you are restoring onto the same cluster, first remove UCP from the cluster by using the uninstall-ucp command. Note: If the restore is happening on new machines, skip this step.
Kubernetes currently backs up the declarative state of Kubernetes objects in etcd. However, for Swarm, there is no way to take the state and export it to a declarative format, because the objects embedded within the Swarm Raft logs are not easily transferable to other nodes or clusters.
For disaster recovery, re-creating Swarm-related workloads requires the original scripts used for deployment. Alternatively, you can re-create workloads manually, using the output of docker inspect commands as a reference.
Docker manager nodes store the swarm state and manager logs in the
/var/lib/docker/swarm/
directory. Swarm raft logs contain crucial
information for re-creating Swarm specific resources, including
services, secrets, configurations and node cryptographic identity. In
1.13 and higher, this data includes the keys used to encrypt the raft
logs. Without these keys, you cannot restore the swarm.
You must perform a manual backup on each manager node, because logs contain node IP address information and are not transferable to other nodes. If you do not backup the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster.
Note
You can avoid performing Swarm backup by storing stacks, services definitions, secrets, and networks definitions in a Source Code Management or Config Management tool.
Data | Description | Backed up |
---|---|---|
Raft keys | Used to encrypt communication among Swarm nodes and to encrypt and decrypt Raft logs | yes |
Membership | List of the nodes in the cluster | yes |
Services | Stacks and services stored in Swarm mode | yes |
Networks (overlay) | The overlay networks created on the cluster | yes |
Configs | The configs created in the cluster | yes |
Secrets | Secrets saved in the cluster | yes |
Swarm unlock key | Must be saved in a password manager! | no |
If auto-lock is enabled, retrieve your Swarm unlock key and store it in a safe location; you need it to restore the swarm from backup.
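You can retrieve the current unlock key on a manager node with the following command (a sketch; run it before stopping the engine):
docker swarm unlock-key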
Because you must stop the engine of the manager node before performing the backup, having three manager nodes is recommended for high availability (HA). For a cluster to be operational, a majority of managers must be online. If fewer than three managers exist, the cluster is unavailable during the backup.
Note
During the time that a manager is shut down, your swarm is more vulnerable to losing the quorum if further nodes are lost. A loss of quorum means that the swarm is unavailable until quorum is recovered. Quorum is only recovered when more than 50% of the nodes are again available. If you regularly take down managers to do backups, consider running a 5-manager swarm, so that you can lose an additional manager while the backup is running without disrupting services.
Select a manager node. Try not to select the leader in order to avoid a new election inside the cluster:
docker node ls -f "role=manager" | tail -n+2 | grep -vi leader
Optional: Store the Docker version in a variable for easy addition to your backup name:
ENGINE=$(docker version -f '{{.Server.Version}}')
Stop the Docker Engine on the manager before backing up the data, so that no data is changed during the backup:
systemctl stop docker
Back up the entire /var/lib/docker/swarm folder:
tar cvzf "/tmp/swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz" /var/lib/docker/swarm/
Note: You can decode the Unix epoch in the filename by typing date -d @timestamp. For example:
date -d @1531166143
Mon Jul 9 19:55:43 UTC 2018
Restart the manager Docker Engine:
systemctl start docker
Except for step 1, repeat the previous steps for each manager node.
UCP backups no longer require pausing the reconciler and deleting UCP containers, and backing up a UCP manager does not disrupt the manager’s activities.
Because UCP stores the same data on all manager nodes, you only need to back up a single UCP manager node.
User resources, such as services, containers, and stacks are not affected by this operation and continue operating as expected.
Backup contents are stored in a .tar file. Backups contain the UCP configuration metadata needed to re-create configurations such as Administration Settings values (for example, LDAP and SAML) and RBAC configurations (collections, grants, roles, users, and more):
Data | Description | Backed up |
---|---|---|
Configurations | UCP configurations, including the Docker Engine - Enterprise license, Swarm, and client CAs | yes |
Access control | Permissions for teams to swarm resources, including collections, grants, and roles | yes |
Certificates and keys | Certificates and public and private keys used for authentication and mutual TLS communication | yes |
Metrics data | Monitoring data gathered by UCP | yes |
Organizations | Users, teams, and organizations | yes |
Volumes | All UCP named volumes, including all UCP component certificates and data | yes |
Overlay networks | Swarm-mode overlay network definitions, including port information | no |
Configs, Secrets | Create a Swarm backup to back up these data | no |
Services | Stacks and services are stored in Swarm mode or SCM/Config Management | no |
Note
Because Kubernetes stores the state of resources on etcd, a backup of etcd is sufficient for stateless backups.
- ucp-metrics-data: holds the metrics server's data.
- ucp-node-certs: holds certs used to lock down UCP system components.
UCP backups include all Kubernetes declarative objects (pods, deployments, replicasets, configurations, and so on), including secrets. These objects are stored in the ucp-kv etcd database that is backed up (and restored) as part of UCP backup/restore.
Note
You cannot back up Kubernetes volumes and node labels. Instead, upon restore, Kubernetes declarative objects are re-created. Containers are re-created and IP addresses are resolved.
For more information, see Backing up an etcd cluster.
To avoid directly managing backup files, you can specify a file name and host directory on a secure and configured storage backend, such as NFS or another networked file system. The file system location is the backup folder on the manager node file system. This location must be writable by the nobody user, which is specified by changing the folder ownership to nobody. This operation requires administrator permissions to the manager node, and must only be run once for a given file system location.
sudo chown nobody:nogroup /path/to/folder
Important
Specify a different name for each backup file. Otherwise, the existing backup file with the same name is overwritten. Specify a location that is mounted on a fault-tolerant file system (such as NFS) rather than the node’s local disk. Otherwise, it is important to regularly move backups from the manager node’s local disk to ensure adequate space for ongoing backups.
There are several options for creating a UCP backup:
The backup process runs on one manager node.
The following example shows how to create a UCP manager node backup, encrypt it by using a passphrase, decrypt it, verify its contents, and store it locally on the node at /tmp/mybackup.tar:
Run the docker/ucp:3.2.5 backup command on a single UCP manager and include the --file and --include-logs options. This creates a tar archive with the contents of all volumes used by UCP and streams it to stdout. Replace 3.2.5 with the version you are currently running.
$ docker container run \
--rm \
--log-driver none \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
--volume /tmp:/backup \
docker/ucp:3.2.5 backup \
--file mybackup.tar \
--passphrase "secret12chars" \
--include-logs=false
Note
If you are running with Security-Enhanced Linux (SELinux) enabled, which is typical for RHEL hosts, you must include --security-opt label=disable in the docker command (replace 3.2.5 with the version you are currently running):
$ docker container run \
--rm \
--log-driver none \
--security-opt label=disable \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 backup \
--passphrase "secret12chars" > /tmp/mybackup.tar
Note
To determine whether SELinux is enabled in the engine, view the host's /etc/docker/daemon.json file, and search for the string "selinux-enabled":"true".
To view backup progress and error reporting, view the contents of the stderr streams of the running backup container during the backup. Progress is updated for each backup step, for example, after validation, after volumes are backed up, after etcd is backed up, and after rethinkDB. Progress is not preserved after the backup has completed.
In a valid backup file, 27 or more files are displayed in the list and the ./ucp-controller-server-certs/key.pem file is present. Ensure the backup is a valid tar file by listing its contents, as shown in the following example:
$ gpg --decrypt /directory1/directory2/backup.tar | tar --list
If decryption is not needed, you can list the contents directly with tar, as shown in the following example:
$ tar --list -f /directory1/directory2/backup.tar
To create a UCP backup using the UI:
The UI also provides the following options:
- Display the status of a running backup
- Display backup history
- View backup contents
The UCP API provides three endpoints for managing UCP backups. You must be a UCP administrator to access these API endpoints.
You can create a backup with the POST: /api/ucp/backup
endpoint.
This is a JSON endpoint with the following arguments:
field name | JSON data type* | description |
---|---|---|
passphrase | string | Encryption passphrase |
noPassphrase | bool | Set to true if not using a passphrase |
fileName | string | Backup file name |
includeLogs | bool | Specifies whether to include a log file |
hostPath | string | File system location |
The request returns one of the following HTTP status codes, and, if successful, a backup ID.
$ curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backup \
-X POST \
-H "Content-Type: application/json" \
--data '{"encrypted": true, "includeLogs": true, "fileName": "backup1.tar", "logFileName": "backup1.log", "hostPath": "/secure-location"}'
200 OK
where:
- $AUTHTOKEN is your authentication bearer token if using auth token identification.
- $UCP_HOSTNAME is your UCP hostname.
is your UCP hostname.You can view all existing backups with the GET: /api/ucp/backups
endpoint. This request does not expect a payload and returns a list of
backups, each as a JSON object following the schema found in the Backup
schema section.
The request returns one of the following HTTP status codes and, if successful, a list of existing backups:
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backups
[
  {
    "id": "0d0525dd-948a-41b4-9f25-c6b4cd6d9fe4",
    "encrypted": true,
    "fileName": "backup2.tar",
    "logFileName": "backup2.log",
    "backupPath": "/secure-location",
    "backupState": "SUCCESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T21:55:53.775Z",
    "completed_at": "2019-04-10T21:56:01.184Z"
  },
  {
    "id": "2cf210df-d641-44ca-bc21-bda757c08d18",
    "encrypted": true,
    "fileName": "backup1.tar",
    "logFileName": "backup1.log",
    "backupPath": "/secure-location",
    "backupState": "IN_PROGRESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T01:23:59.404Z",
    "completed_at": "0001-01-01T00:00:00Z"
  }
]
You can retrieve details for a specific backup using the GET: /api/ucp/backup/{backup_id} endpoint, where {backup_id} is the ID of an existing backup. This request returns the backup, if it exists, for the specified ID, as a JSON object following the schema found in the Backup schema section.
The request returns one of the following HTTP status codes, and if successful, the backup for the specified ID:
{backup_id}
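For example, a sketch that reuses $AUTHTOKEN and $UCP_HOSTNAME from the earlier examples, with a backup ID taken from the list output above:
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backup/0d0525dd-948a-41b4-9f25-c6b4cd6d9fe4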
The following table describes the backup schema returned by the GET and LIST APIs:
field name | JSON data type* | description |
---|---|---|
id | string | Unique ID |
encrypted | boolean | Set to true if encrypted with a passphrase |
fileName | string | Backup file name if backing up to a file, empty otherwise |
logFileName | string | Backup log file name if saving backup logs, empty otherwise |
backupPath | string | Host path where backup resides |
backupState | string | Current state of the backup (IN_PROGRESS, SUCCESS, FAILED) |
nodeLocation | string | Node on which the backup was taken |
shortError | string | Short error. Empty unless backupState is set to FAILED |
created_at | string | Time of backup creation |
completed_at | string | Time of backup completion |
state.json in the zip file.
If auto-lock was enabled on the old Swarm, the unlock key is required to perform the restore.
Use the following procedure on each manager node to restore data to a new swarm.
Shut down the Docker Engine on the node you select for the restore:
systemctl stop docker
Remove the contents of the /var/lib/docker/swarm directory on the new Swarm, if it exists.
Restore the /var/lib/docker/swarm directory with the contents of the backup.
Note
The new node uses the same encryption key for on-disk storage as the old one. It is not possible to change the on-disk storage encryption keys at this time. In the case of a swarm with auto-lock enabled, the unlock key is also the same as on the old swarm, and the unlock key is needed to restore the swarm.
Start Docker on the new node. Unlock the swarm if necessary.
systemctl start docker
Re-initialize the swarm so that the node does not attempt to connect to nodes that were part of the old swarm, and presumably no longer exist:
$ docker swarm init --force-new-cluster
Verify that the state of the swarm is as expected. This may include application-specific tests or simply checking the output of docker service ls to be sure that all expected services are present.
If you use auto-lock, rotate the unlock key.
Add the manager and worker nodes to the new swarm.
Reinstate your previous backup regimen on the new swarm.
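The unlock and key-rotation steps above map to the following commands (a sketch; run them on the restored manager):
docker swarm unlock
docker swarm unlock-key --rotate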
To restore UCP, select one of the following options:
- If the target Docker engine is not part of a swarm, the restore performs docker swarm init in the same way as the install operation would. A new swarm is created and UCP is restored on top.
- To restore an existing UCP installation from a backup, first remove UCP from the cluster by using the uninstall-ucp command.
- Run the restore using the same UCP version (docker/ucp image version) as the backed-up cluster. Restoring to a later patch release version is allowed.
For more information, see Restoring an etcd cluster.
When the restore operation starts, it looks for the UCP version used in the backup and performs one of the following actions:
- Fails if the restore operation is running using an image that does not match the UCP version from the backup (a `--force` flag is available to override this if necessary)
- Provides instructions on how to run the restore process using the matching UCP version from the backup
Volumes are placed onto the host on which the UCP restore command runs.
The following example shows how to restore UCP from an existing backup file, presumed to be located at /tmp/backup.tar (replace 3.2.5 with the version of your backup):
$ docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 restore < /tmp/backup.tar
If the backup file is encrypted with a passphrase, provide the passphrase to the restore operation (replace 3.2.5 with the version of your backup):
$ docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 restore --passphrase "secret" < /tmp/backup.tar
The restore command can also be invoked in interactive mode, in which case the backup file should be mounted to the container rather than streamed through stdin:
$ docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
-v /tmp/backup.tar:/config/backup.tar \
docker/ucp:3.2.5 restore -i
The current certs volume, which contains cluster-specific information (such as SANs), is invalid on new clusters with different IPs. For volumes that are not backed up (ucp-node-certs, for example), the restore regenerates certs. For certs that are backed up (ucp-controller-server-certs), the restore does not perform a regeneration, and you must correct those certs when the restore completes.
After you successfully restore UCP, you can add new managers and workers the same way you would after a fresh installation.
For restore operations, view the output of the restore command.
A successful UCP restore involves verifying the following items:
All swarm managers are healthy after running the following command:
curl -s -k https://localhost/_ping
Alternatively, check the UCP UI Nodes page for node status, and monitor the UI for warning banners about unhealthy managers.
Universal Control Plane (UCP) lets you authorize users to view, edit, and use cluster resources by granting role-based permissions against resource sets.
To authorize access to cluster resources across your organization, UCP administrators might take the following high-level steps:
A subject represents a user, team, organization, or a service account. A subject can be granted a role that defines permitted operations against one or more resource sets.
Roles define what operations can be done by whom. A role is a set of permitted operations against a type of resource, like a container or volume, which is assigned to a user or a team with a grant.
For example, the built-in role, Restricted Control, includes permissions to view and schedule nodes but not to update nodes. A custom DBA role might include permissions to r-w-x (read, write, and execute) volumes and secrets.
Most organizations use multiple roles to fine-tune the appropriate access. A given team or user may have different roles provided to them depending on what resource they are accessing.
To control user access, cluster resources are grouped into Docker Swarm collections or Kubernetes namespaces.
Kubernetes comes with a default namespace for your cluster objects, plus two more namespaces for system and public resources. You can create custom namespaces, but unlike Swarm collections, namespaces cannot be nested. Resource types that users can access in a Kubernetes namespace include pods, deployments, network policies, nodes, services, secrets, and many more.

Together, collections and namespaces are named resource sets.
A grant is made up of a subject, a role, and a resource set.
Grants define which users can access what resources in what way. Grants are effectively Access Control Lists (ACLs) which provide comprehensive access policies for an entire organization when grouped together.
Only an administrator can manage grants, subjects, roles, and access to resources.
Important
An administrator is a user who creates subjects, groups resources by moving them into collections or namespaces, defines roles by selecting allowable operations, and applies grants to users and teams.
For cluster security, only UCP admin users and service accounts that are
granted the cluster-admin
ClusterRole for all Kubernetes namespaces
via a ClusterRoleBinding can deploy pods with privileged options. This
prevents a platform user from being able to bypass the Universal Control
Plane Security Model. These privileged options include:
Pods with any of the following defined in the Pod Specification:

- PodSpec.hostIPC - Prevents a user from deploying a pod in the host's IPC namespace.
- PodSpec.hostNetwork - Prevents a user from deploying a pod in the host's network namespace.
- PodSpec.hostPID - Prevents a user from deploying a pod in the host's PID namespace.
- SecurityContext.allowPrivilegeEscalation - Prevents a child process of a container from gaining more privileges than its parent.
- SecurityContext.capabilities - Prevents additional Linux capabilities from being added to a pod.
- SecurityContext.privileged - Prevents a user from deploying a privileged container.
- Volume.hostPath - Prevents a user from mounting a path from the host into the container. This could be a file, a directory, or even the Docker socket.

Persistent Volumes using the following storage classes:

- Local - Prevents a user from creating a persistent volume with the Local storage class. The Local storage class allows a user to mount directories from the host into a pod. This could be a file, a directory, or even the Docker socket.

Note
If an Admin has created a persistent volume with the local storage class, a non-admin could consume this via a persistent volume claim.
If a user without a cluster admin role tries to deploy a pod with any of these privileged options, an error similar to the following example is displayed:
Error from server (Forbidden): error when creating "pod.yaml": pods "mypod"
is forbidden: user "<user-id>" is not an admin and does not have permissions
to use privileged mode for resource
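For illustration, the following minimal sketch (the mypod name and pod.yaml file name mirror the error message above; the image is a placeholder) shows a manifest that a non-admin cannot apply because it requests a privileged container:

# Hypothetical pod spec that sets SecurityContext.privileged; applying it
# as a non-admin returns the "is not an admin" error shown above.
cat <<'EOF' > pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: web
    image: nginx:latest
    securityContext:
      privileged: true
EOF
kubectl apply -f pod.yaml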
Individual users can belong to one or more teams but each team can only be in one organization. At the fictional startup, Acme Company, all teams in the organization are necessarily unique but the user, Alex, is on two teams:
acme-datacenter
├── dba
│   └── Alex*
├── dev
│   └── Bett
└── ops
    ├── Alex*
    └── Chad
All users are authenticated on the backend. Docker EE provides built-in authentication and also integrates with LDAP directory services.
To use Docker EE’s built-in authentication, you must create users manually.
The general flow of designing an organization with teams in UCP is:
To create an organization in UCP:
To create teams in the organization:
New users are assigned a default permission level so that they can access the cluster. To extend a user’s default permissions, add them to a team and create grants. You can optionally grant them Docker EE administrator permissions.
To manually create users in UCP:
Note
A Docker Admin can grant users permission to change the cluster configuration and manage grants, roles, and resource sets.
To enable LDAP in UCP and sync to your LDAP directory:
Select Yes by LDAP Enabled. A list of LDAP settings displays.

If Docker EE is configured to sync users with your organization’s LDAP directory server, you can enable syncing the new team’s members when creating a new team or when modifying settings of an existing team.
There are two methods for matching group members from an LDAP directory, direct bind and search bind.
Select Immediately Sync Team Members to run an LDAP sync operation immediately after saving the configuration for the team. It may take a moment before the members of the team are fully synced.
This option specifies that team members should be synced directly with members of a group in your organization’s LDAP directory. The team’s membership will be synced to match the membership of the group.
This option specifies that team members should be synced using a search query against your organization’s LDAP directory. The team’s membership will be synced to match the users in the search results.
A role defines a set of API operations permitted against a resource set. You apply roles to users and teams by creating grants.
Some important rules regarding roles:
You can define custom roles or use the following built-in roles:
Role | Description |
---|---|
None | Users have no access to Swarm or Kubernetes resources. Maps to the No Access role in UCP 2.1.x. |
View Only | Users can view resources but can’t create them. |
Restricted Control | Users can view and edit resources but can’t run a service or container in a way that affects the node where it’s running. Users cannot mount a node directory, exec into containers, or run containers in privileged mode or with additional kernel capabilities. |
Scheduler | Users can view nodes (worker and manager) and schedule (not view) workloads on these nodes. By default, all users are granted the Scheduler role against the /Shared collection. (To view workloads, users need permissions such as Container View.) |
Full Control | Users can view and edit all granted resources. They can create containers without any restriction, but can’t see the containers of other users. |
When creating custom roles to use with Swarm, the Roles page lists all default and custom roles applicable in the organization. To create custom roles for Kubernetes, see Configure native Kubernetes role-based access control.
You can give a role a global name, such as “Remove Images”, which might enable the Remove and Force Remove operations for images. You can apply a role with the same name to different resource sets.
This section describes the set of operations (calls) that can be executed to the Swarm resources. Be aware that each permission corresponds to a CLI command and enables the user to execute that command.
Some important rules regarding roles:
Docker EE enables access control to cluster resources by grouping resources into resource sets. Combine resource sets with grants to give users permission to access specific cluster resources.
A resource set can be:
A namespace allows you to group resources like Pods, Deployments, Services, or any other Kubernetes-specific resources. You can then enforce RBAC policies and resource quotas for the namespace.
Each Kubernetes resource can only be in one namespace, and namespaces cannot be nested inside one another.
A Swarm collection is a directory of cluster resources like nodes, services, volumes, or other Swarm-specific resources.
Each Swarm resource can only be in one collection at a time, but collections can be nested inside one another, to create hierarchies.
You can nest collections inside one another. If a user is granted permissions for one collection, they’ll have permissions for its child collections, much like a directory structure. As of UCP 3.1, the ability to create a nested collection more than two layers deep within the root /Swarm/ collection has been deprecated.
The following image provides two examples of nested collections with the recommended maximum of two nesting layers. The first example illustrates an environment-oriented collection, and the second example illustrates an application-oriented collection.
For a child collection, or for a user who belongs to more than one team, the system concatenates permissions from multiple roles into an “effective role” for the user, which specifies the operations that are allowed against the target.
Docker EE provides a number of built-in collections.
Default Collection | Description |
---|---|
/ | Path to all resources in the Swarm cluster. Resources not in a collection are put here. |
/System | Path to UCP managers, DTR nodes, and UCP/DTR system services. By default, only admins have access, but this is configurable. |
/Shared | Default path to all worker nodes for scheduling. In Docker EE Standard, all worker nodes are located here. In Docker EE Advanced, worker nodes can be moved and isolated. |
/Shared/Private | Path to a user’s private collection. Note that private collections are not created until the user logs in for the first time. |
/Shared/Legacy | Path to the access control labels of legacy versions (UCP 2.1 and lower). |
Each user has a default collection which can be changed in UCP preferences.
Users can’t deploy a resource without a collection. When a user deploys a resource without an access label, Docker EE automatically places the resource in the user’s default collection.
With Docker Compose, the system applies default collection labels across
all resources in the stack unless com.docker.ucp.access.label
has
been explicitly set.
Default collections and collection labels
Default collections are good for users who work only on a well-defined slice of the system, as well as users who deploy stacks and don’t want to edit the contents of their compose files. A user with more versatile roles in the system, such as an administrator, might find it better to set custom labels for each resource.
Resources are marked as being in a collection by using labels. Some resource types don’t have editable labels, so you can’t move them across collections.
Note
For editable resources, you can change the com.docker.ucp.access.label to move resources to different collections. For example, you may need to deploy resources to a collection other than your default collection.
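As a minimal sketch (the service name and target collection are placeholders), an authorized user could move a running Swarm service by rewriting its access label from the CLI:

# Move the service into the /Shared/database collection by updating its access label.
docker service update \
  --label-add com.docker.ucp.access.label="/Shared/database" \
  redis_2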
The system uses the additional labels, com.docker.ucp.collection.*
,
to enable efficient resource lookups. By default, nodes have the
com.docker.ucp.collection.root
,
com.docker.ucp.collection.shared
, and
com.docker.ucp.collection.swarm
labels set to true
. UCP
automatically controls these labels, and you don’t need to manage them.
Collections get generic default names, but you can give them meaningful names, like “Dev”, “Test”, and “Prod”.
A stack is a group of resources identified by a label. You can place
the stack’s resources in multiple collections. Resources are placed in
the user’s default collection unless you specify an explicit
com.docker.ucp.access.label
within the stack/compose file.
Docker EE administrators can create grants to control how users and organizations access resource sets.
A grant defines who has how much access to what resources. Each grant is a 1:1:1 mapping of subject, role, and resource set. For example, you can grant the “Prod Team” “Restricted Control” over services in the “/Production” collection.
A common workflow for creating grants has four steps:
With Kubernetes orchestration, a grant is made up of subject, role, and namespace.
Important
This section assumes that you have created objects for the grant: subject, role, namespace.
To create a Kubernetes grant (role binding) in UCP:
With Swarm orchestration, a grant is made up of subject, role, and collection.
Note
This section assumes that you have created objects to grant: teams/users, roles (built-in or custom), and a collection.
To create a grant in UCP:
Important
By default, all new users are placed in the docker-datacenter
organization. To apply permissions to all Docker EE users,
create a grant with the docker-datacenter
organization as a
subject.
Docker EE administrators can reset user passwords managed in UCP:
User passwords managed with an LDAP service must be changed on the LDAP server.
Administrators who need to update their passwords can ask another administrator for help or SSH into a Docker Enterprise manager node and run:
docker run --net=host -v ucp-auth-api-certs:/tls -it "$(docker inspect --format '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' ucp-auth-api)" "$(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' ucp-auth-api)" passwd -i
If you have DEBUG set as your global log level within UCP, running $(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' ucp-auth-api) returns --debug instead of --db-addr. Pass Args 1 to docker inspect instead to reset your admin password:
docker run --net=host -v ucp-auth-api-certs:/tls -it "$(docker inspect
--format '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' ucp-auth-api)"
"$(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 1
}}' ucp-auth-api)" passwd -i
This tutorial explains how to deploy an NGINX web server and limit access to one team with role-based access control (RBAC).
You are the Docker EE system administrator at Acme Company and need to configure permissions to company resources. The best way to do this is to:
Add the organization, acme-datacenter
, and create three teams
according to the following structure:
acme-datacenter
├── dba
│   └── Alex*
├── dev
│   └── Bett
└── ops
    ├── Alex*
    └── Chad
In this section, we deploy NGINX with Kubernetes.
Create a namespace to logically store the NGINX application:
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-namespace
For this exercise, create a simple role for the ops team.
Grant the ops team (and only the ops team) access to nginx-namespace with the custom role, Kube Deploy.
acme-datacenter/ops + Kube Deploy + nginx-namespace
You’ve configured Docker EE. The ops
team can now deploy nginx
.
Log on to UCP as “chad” (on the ops
team).
Click Kubernetes > Namespaces.
Paste the following manifest in the terminal window and click Create.
apiVersion: apps/v1beta2 # Use apps/v1beta1 for versions < 1.8.0
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
Log on to UCP as each user and ensure that:

- dba (alex) can’t see nginx-namespace.
- dev (bett) can’t see nginx-namespace.

In this section, we deploy nginx as a Swarm service. See Kubernetes Deployment for the same exercise with Kubernetes.
Create a collection for NGINX resources, nested under the /Shared
collection:
/
├── System
└── Shared
    └── nginx-collection
Tip
To drill into a collection, click View Children.
You can use the built-in roles or define your own. For this exercise, create a simple role for the ops team named Swarm Deploy.

Grant the ops team (and only the ops team) access to nginx-collection with the built-in role, Swarm Deploy.

acme-datacenter/ops + Swarm Deploy + /Shared/nginx-collection
You’ve configured Docker Enterprise. The ops team can now deploy an nginx Swarm service.

Log on to UCP as chad (on the ops team) and create a Swarm service named nginx-service. When selecting the collection for the service, navigate to /Shared in the breadcrumbs and choose nginx-collection.

Log on to UCP as each user and ensure that:

- dba (alex) cannot see nginx-collection.
- dev (bett) cannot see nginx-collection.

In this example, two teams are granted access to volumes in two different resource collections. UCP access control prevents the teams from viewing and accessing each other’s volumes, even though they may be located in the same nodes.
Navigate to the Organizations & Teams page to create two teams in the “engineering” organization, named “Dev” and “Prod”. Add a user who’s not a UCP administrator to the Dev team, and add another non-admin user to the Prod team.
In this example, the Dev and Prod teams use two different volumes, which they
access through two corresponding resource collections. The collections are
placed under the /Shared
collection.
In this example, the Dev team gets access to its volumes from a grant that
associates the team with the /Shared/dev-volumes
collection, and the Prod
team gets access to its volumes from another grant that associates the team
with the /Shared/prod-volumes
collection.
Navigate to the Grants page and click Create Grant.
In the left pane, click Collections, and in the Swarm collection, click View Children.
In the Shared collection, click View Children.
In the list, find /Shared/dev-volumes and click Select Collection.
Click Roles, and in the dropdown, select Restricted Control.
Click Subjects, and under Select subject type, click Organizations. In the dropdown, pick the engineering organization, and in the Team dropdown, select Dev.
Click Create to grant permissions to the Dev team.
Click Create Grant and repeat the previous steps for the /Shared/prod-volumes collection and the Prod team.
With the collections and grants in place, users can sign in and create volumes in their assigned collections.
Team members have permission to create volumes in their assigned collection.
Log in as one of the users on the Dev team.
Navigate to the Volumes page to view all of the volumes in the swarm that the user can access.
Click Create volume and name the new volume “dev-data”.
In the left pane, click Collections. The default collection appears. At the top of the page, click Shared, find the dev-volumes collection in the list, and click Select Collection.
Click Create to add the “dev-data” volume to the collection.
Log in as one of the users on the Prod team, and repeat the previous
steps to create a “prod-data” volume assigned to the
/Shared/prod-volumes
collection.
Now you can see role-based access control in action for volumes. The user on the Prod team can’t see the Dev team’s volumes, and if you log in again as a user on the Dev team, you won’t see the Prod team’s volumes.
Sign in with a UCP administrator account, and you see all of the volumes created by the Dev and Prod users.
With Docker EE Advanced, you can enable physical isolation of resources by
organizing nodes into collections and granting Scheduler
access for
different users. To control access to nodes, move them to dedicated collections
where you can grant access to specific users, teams, and organizations.
In this example, a team gets access to a node collection and a resource collection, and UCP access control ensures that the team members cannot view or use swarm resources that aren’t in their collection.
You need a Docker EE Advanced license and at least two worker nodes to complete this example.
To isolate cluster nodes:
Create an Ops
team and assign a user to it.
Create a /Prod
collection for the team’s node.
Assign a worker node to the /Prod
collection.
Grant the Ops
teams access to its collection.
In the web UI, navigate to the Organizations & Teams page to create a team named “Ops” in your organization. Add a user who is not a UCP administrator to the team.
In this example, the Ops team uses an assigned group of nodes, which it accesses through a collection. Also, the team has a separate collection for its resources.
Create two collections: one for the team’s worker nodes and another for the team’s resources.
You’ve created two new collections. The /Prod
collection is for the worker
nodes, and the /Prod/Webserver
sub-collection is for access control to an
application that you’ll deploy on the corresponding worker nodes.
By default, worker nodes are located in the /Shared
collection.
Worker nodes that are running DTR are assigned to the /System
collection. To control access to the team’s nodes, move them to a
dedicated collection.
Move a worker node by changing the value of its access label key,
com.docker.ucp.access.label
, to a different collection.
If a worker node is assigned to the /System collection, click another worker node, because you can’t move nodes that are in the /System collection. By default, worker nodes are assigned to the /Shared collection.

In the node’s details, find the com.docker.ucp.access.label label and change its value from /Shared to /Prod. The node is now assigned to the /Prod collection.
collection.Docker EE Advanced required
If you don’t have a Docker EE Advanced license, you’ll get the following error message when you try to change the access label: Nodes must be in either the shared or system collection without an advanced license.
You need two grants to control access to nodes and container resources:

- Grant the Ops team the Restricted Control role for the /Prod/Webserver resources.
- Grant the Ops team the Scheduler role against the nodes in the /Prod collection.

Create two grants for team access to the two collections, starting with the /Prod/Webserver collection. The same steps apply for the nodes in the /Prod collection.
Navigate to the Grants page and click Create Grant.
In the left pane, click Collections, and in the Swarm collection, click View Children.
In the Prod collection, click Select Collection.
In the left pane, click Roles, and in the dropdown, select Scheduler.
In the left pane, click Subjects, and under Select subject type, click Organizations.
Select your organization, and in the Team dropdown, select Ops.
Click Create to grant the Ops team Scheduler
access to the
nodes in the /Prod
collection.
The cluster is set up for node isolation. Users with access to nodes in the
/Prod
collection can deploy Swarm services and Kubernetes apps, and their
workloads won’t be scheduled on nodes that aren’t in the collection.
When a user deploys a Swarm service, UCP assigns its resources to the user’s default collection.
From the target collection of a resource, UCP walks up the ancestor
collections until it finds the highest ancestor that the user has
Scheduler
access to. Tasks are scheduled on any nodes in the tree
below this ancestor. In this example, UCP assigns the user’s service to
the /Prod/Webserver
collection and schedules tasks on nodes in the
/Prod
collection.
As a user on the Ops team, set your default collection to /Prod/Webserver.

Log in as a user on the Ops team and deploy a service. The service is automatically deployed to worker nodes in the /Prod collection: all resources are placed under the user’s default collection, /Prod/Webserver, and the containers are scheduled only on the nodes under /Prod.
Navigate to the Services page, and click Create Service.
Name the service “NGINX”, use the “nginx:latest” image, and click Create.
When the nginx service status is green, click the service. In the details view, click Inspect Resource, and in the dropdown, select Containers.
Click the NGINX container, and in the details pane, confirm that its Collection is /Prod/Webserver.
Click Inspect Resource, and in the dropdown, select Nodes.
Click the node, and in the details pane, confirm that its Collection is /Prod.
Another approach is to use a grant instead of changing the user’s default
collection. An administrator can create a grant for a role that has the
Service Create
permission against the /Prod/Webserver
collection or a
child collection. In this case, the user sets the value of the service’s access
label, com.docker.ucp.access.label
, to the new collection or one of its
children that has a Service Create
grant for the user.
Starting in Docker Enterprise Edition 2.0, you can deploy a Kubernetes workload to worker nodes, based on a Kubernetes namespace.
To deploy Kubernetes workloads, an administrator must convert a worker node to use the Kubernetes orchestrator.
An administrator must create a Kubernetes namespace to enable node isolation for Kubernetes workloads.
In the left pane, click Kubernetes.
Click Create to open the Create Kubernetes Object page.
In the Object YAML editor, paste the following YAML.
apiVersion: v1
kind: Namespace
metadata:
  name: ops-nodes
Click Create to create the ops-nodes
namespace.
Create a grant to the ops-nodes namespace for the Ops
team by following
the same steps that you used to grant access to the /Prod
collection, only
this time, on the Create Grant page, pick Namespaces, instead of
Collections.
Select the ops-nodes namespace, and create a Full Control grant for the Ops team.
The last step is to link the Kubernetes namespace to the /Prod collection.
Navigate to the Namespaces page, and find the ops-nodes namespace in the list.
Click the More options icon and select Link nodes in collection.
In the Choose collection section, click View children on the Swarm collection to navigate to the Prod collection.
On the Prod collection, click Select collection.
Click Confirm to link the namespace to the collection.
Log in as a non-admin who’s on the Ops team.
In the left pane, open the Kubernetes section.
Confirm that ops-nodes is displayed under Namespaces.
Click Create, and in the Object YAML editor, paste the following YAML definition for an NGINX server.
```
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
```
Click Create to deploy the workload.
In the left pane, click Pods and confirm that the workload is running on pods in the ops-nodes namespace.
By default, only admin users can pull images into a cluster managed by UCP. Images are a shared resource; as such, they are always in the swarm collection. To allow users to pull images, you need to grant them the image load permission for the swarm collection.
As an admin user, go to the UCP web UI, navigate to the Roles
page, and create a new role named Pull images
.
Then go to the Grants page, and create a new grant with the Pull images role, the user or team that needs to pull images as the subject, and the swarm collection as the resource set.

Once you click Create, the user is able to pull images from the UCP web UI or the CLI.
Collections and grants are strong tools that can be used to control access and visibility to resources in UCP.
This tutorial describes a fictitious company named OrcaBank that needs to configure an architecture in UCP with role-based access control (RBAC) for their application engineering group.
OrcaBank reorganized their application teams by product with each team providing shared services as necessary. Developers at OrcaBank do their own DevOps and deploy and manage the lifecycle of their applications.
OrcaBank has four teams with the following resource needs:
- security should have view-only access to all applications in the cluster.
- db should have full access to all database applications and resources.
- mobile should have full access to their mobile applications and limited access to shared db services.
- payments should have full access to their payments applications and limited access to shared db services.

To assign the proper access, OrcaBank is employing a combination of default and custom roles:
- View Only (default role) allows users to see all resources (but not edit or use).
- Ops (custom role) allows users to perform all operations against configs, containers, images, networks, nodes, secrets, services, and volumes.
- View & Use Networks + Secrets (custom role) enables users to view/connect to networks and view/use secrets used by db containers, but prevents them from seeing or impacting the db applications themselves.

OrcaBank is also creating collections of resources to mirror their team structure.
Currently, all OrcaBank applications share the same physical resources,
so all nodes and applications are being configured in collections that
nest under the built-in collection, /Shared
.
Other collections are also being created to enable shared db
applications.
Note
For increased security with node-based isolation, use Docker Enterprise Advanced.
- /Shared/mobile hosts all Mobile applications and resources.
- /Shared/payments hosts all Payments applications and resources.
- /Shared/db is a top-level collection for all db resources.
- /Shared/db/payments is a collection of db resources for Payments applications.
- /Shared/db/mobile is a collection of db resources for Mobile applications.

The collection architecture has the following tree representation:
/
├── System
└── Shared
    ├── mobile
    ├── payments
    └── db
        ├── mobile
        └── payments
OrcaBank’s Grant composition ensures that their collection architecture gives
the db
team access to all db
resources and restricts app teams to
shared db
resources.
OrcaBank has standardized on LDAP for centralized authentication to help their identity team scale across all the platforms they manage.
To implement LDAP authentication in UCP, OrcaBank is using UCP’s native LDAP/AD integration to map LDAP groups directly to UCP teams. Users can be added to or removed from UCP teams via LDAP which can be managed centrally by OrcaBank’s identity team.
The following grant composition shows how LDAP groups are mapped to UCP teams.
OrcaBank is taking advantage of the flexibility in UCP’s grant model by
applying two grants to each application team. One grant allows each team
to fully manage the apps in their own collection, and the second grant
gives them the (limited) access they need to networks and secrets within
the db
collection.
OrcaBank’s resulting access architecture shows applications connecting across collection boundaries. By assigning multiple grants per team, the Mobile and Payments applications teams can connect to dedicated Database resources through a secure and controlled interface, leveraging Database networks and secrets.
Note
In Docker Enterprise Standard, all resources are deployed across the same group of UCP worker nodes. Node segmentation is provided in Docker Enterprise Advanced and discussed in the next tutorial.
The db
team is responsible for deploying and managing the full
lifecycle of the databases used by the application teams. They can
execute the full set of operations against all database resources.
The mobile
team is responsible for deploying their own application
stack, minus the database tier that is managed by the db
team.
Go through the Docker Enterprise Standard tutorial, before continuing here with Docker Enterprise Advanced.
In the first tutorial, the fictional company, OrcaBank, designed an architecture with role-based access control (RBAC) to meet their organization’s security needs. They assigned multiple grants to fine-tune access to resources across collection boundaries on a single platform.
In this tutorial, OrcaBank implements new and more stringent security requirements for production applications:
First, OrcaBank adds a staging zone to their deployment model. They will no longer move developed applications directly into production. Instead, they will deploy apps from their dev cluster to staging for testing, and then to production.
Second, production applications are no longer permitted to share any physical infrastructure with non-production infrastructure. OrcaBank segments the scheduling and access of applications with Node Access Control.
Note
Node Access Control is a feature of Docker EE and provides secure multi-tenancy with node-based isolation. Nodes can be placed in different collections so that resources can be scheduled and isolated on disparate physical or virtual hardware resources.
OrcaBank still has three application teams, payments
, mobile
,
and db
with varying levels of segmentation between them.
Their RBAC redesign is going to organize their UCP cluster into two top-level collections, staging and production, which are completely separate security zones on separate physical infrastructure.
OrcaBank’s four teams now have different needs in production and staging:
- security should have view-only access to all applications in production (but not staging).
- db should have full access to all database applications and resources in production (but not staging).
- mobile should have full access to their Mobile applications in both production and staging and limited access to shared db services.
- payments should have full access to their Payments applications in both production and staging and limited access to shared db services.

OrcaBank has decided to replace their custom Ops role with the built-in Full Control role.
- View Only (default role) allows users to see but not edit all cluster resources.
- Full Control (default role) allows users complete control of all collections granted to them. They can also create containers without restriction but cannot see the containers of other users.
- View & Use Networks + Secrets (custom role) enables users to view/connect to networks and view/use secrets used by db containers, but prevents them from seeing or impacting the db applications themselves.

In the previous tutorial, OrcaBank created separate collections for each application team and nested them all under /Shared.
To meet their new security requirements for production, OrcaBank is redesigning collections in two ways:
The collection architecture now has the following tree representation:
/
├── System
├── Shared
├── prod
│   ├── mobile
│   ├── payments
│   └── db
│       ├── mobile
│       └── payments
│
└── staging
    ├── mobile
    └── payments
OrcaBank must now diversify their grants further to ensure the proper division of access.
The payments and mobile application teams will each have three grants: one for deploying to production, one for deploying to staging, and the same grant to access shared db networks and secrets.
The resulting access architecture, designed with Docker EE Advanced, provides physical segmentation between production and staging using node access control.
Applications are scheduled only on UCP worker nodes in the dedicated
application collection. And applications use shared resources across
collection boundaries to access the databases in the /prod/db
collection.
The OrcaBank db
team is responsible for deploying and managing the
full lifecycle of the databases that are in production. They have the
full set of operations against all database resources.
The mobile
team is responsible for deploying their full application
stack in staging. In production they deploy their own applications but
use the databases that are provided by the db
team.
With Universal Control Plane you can continue using the tools you know and love like the Docker CLI client and kubectl. You just need to download and use a UCP client bundle.
A client bundle contains a private and public key pair that authorizes your requests in UCP. It also contains utility scripts you can use to configure your Docker and kubectl client tools to talk to your UCP deployment.
Download the Docker CLI client by using the UCP web UI. The web UI ensures that you have the right version of the CLI tools for the current version of UCP.
To use the Docker CLI with UCP, download a client certificate bundle by using the UCP web UI.
Once you’ve downloaded a client certificate bundle to your local computer, you can use it to authenticate your requests.
Navigate to the directory where you downloaded the user bundle, and extract the zip file into a directory. Then use the utility script appropriate for your system:
# On Linux or macOS:
cd client-bundle && eval "$(<env.sh)"

# On Windows, run this from an elevated prompt session:
cd client-bundle && env.cmd
The client bundle utility scripts update the DOCKER_HOST environment variable to make your client tools communicate with your UCP deployment, and the DOCKER_CERT_PATH environment variable to use the client certificates that are included in the client bundle you downloaded. The utility scripts also run the kubectl config command to configure kubectl.
To confirm that your client tools are now communicating with UCP, run:
docker version --format '{{.Server.Version}}'
kubectl config current-context
The expected Docker server version starts with ucp/
, and the
expected kubectl context name starts with ucp_
.
You can now use the Docker and kubectl clients to create resources in UCP.
In Docker Enterprise 3.0, new files are contained in the UCP bundle.
These changes support the use of .zip
files with
docker context import
and allow you to directly change your context
using the bundle .zip
file. Navigate to the directory where you
downloaded the user bundle and use docker context import
to add the
new context:
cd client-bundle && docker context import myucp ucp-bundle-$USER.zip
Refer to Working with Contexts (/engine/context/working-with-contexts/) for more information on using Docker contexts.
UCP issues different types of certificates depending on the user:
You can also download client bundles by using the UCP REST API. In this example, we use curl to make the web requests to the API, jq to parse the responses, and unzip to unpack the zip archive.
To install these tools on an Ubuntu distribution, you can run:
sudo apt-get update && sudo apt-get install curl jq unzip
Then you get an authentication token from UCP and use it to download the client certificates.
# Create an environment variable with the user security token
AUTHTOKEN=$(curl -sk -d '{"username":"<username>","password":"<password>"}' https://<ucp-ip>/auth/login | jq -r .auth_token)
# Download the client certificate bundle
curl -k -H "Authorization: Bearer $AUTHTOKEN" https://<ucp-ip>/api/clientbundle -o bundle.zip
# Unzip the bundle.
unzip bundle.zip
# Run the utility script.
eval "$(<env.sh)"
# Confirm that you can see UCP containers:
docker ps -af status=running
On Windows Server 2016, open an elevated PowerShell prompt and run:
$AUTHTOKEN=((Invoke-WebRequest -Body '{"username":"<username>", "password":"<password>"}' -Uri https://`<ucp-ip`>/auth/login -Method POST).Content)|ConvertFrom-Json|select auth_token -ExpandProperty auth_token
[io.file]::WriteAllBytes("ucp-bundle.zip", ((Invoke-WebRequest -Uri https://`<ucp-ip`>/api/clientbundle -Headers @{"Authorization"="Bearer $AUTHTOKEN"}).Content))
When using a UCP client bundle and buildkit, follow the instructions provided in Restrict services to worker nodes to make sure that builds are not accidentally scheduled on manager nodes.
For additional information on ‘docker build’ and buildkit, refer to build command documentation and buildkit documentation.
Docker Enterprise 2.0 and higher deploys Kubernetes as part of a UCP installation. Deploy, manage, and monitor Kubernetes workloads from the UCP dashboard. Users can also interact with the Kubernetes deployment through the Kubernetes command-line tool named kubectl.
To access the UCP cluster with kubectl, install the UCP client bundle.
Important
Kubernetes on Docker Desktop for Mac and Docker Desktop for Windows
Docker Desktop for Mac and Docker Desktop for Windows provide a standalone Kubernetes server that runs on your development machine, with kubectl installed by default. This installation is separate from the Kubernetes deployment on a UCP cluster.
To use kubectl, install the binary on a workstation which has access to your UCP endpoint.
Important
Must install compatible version
Kubernetes only guarantees compatibility with kubectl versions that are +/-1 minor versions away from the Kubernetes version.
First, find which version of Kubernetes is running in your cluster. This
can be found within the Universal Control Plane dashboard or at the UCP
API endpoint version. You can also find
the Kubernetes version using the Docker CLI. You need to source a client
bundle and type the docker version
command.
From the UCP dashboard, click About within the Admin menu in the top left corner of the dashboard. Then navigate to Kubernetes.
Once you have the Kubernetes version, install the kubectl client for the relevant operating system.
You can download the binary from the Kubernetes release page for your version. If you have curl installed on your system, you can use commands like the following sketch in PowerShell.
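This is only a sketch: the URL follows the upstream Kubernetes release layout, and the placeholder must be replaced with the Kubernetes version you found above.

# Download the kubectl.exe binary that matches your cluster's Kubernetes version,
# then place it somewhere on your PATH.
curl.exe -LO https://storage.googleapis.com/kubernetes-release/release/<kubernetes-version>/bin/windows/amd64/kubectl.exe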
Docker Enterprise provides users unique certificates and keys to authenticate against the Docker and Kubernetes APIs. Instructions on how to download these certificates and how to configure kubectl to use them can be found in CLI-based access.
Helm is the package manager for Kubernetes. Tiller is the Helm server. Before installing Helm on Docker Enterprise, you must meet the following requirements:
To use Helm and Tiller with UCP, you must grant the default service account within the kube-system namespace the necessary roles. Enter the following kubectl commands in this order:
kubectl create rolebinding default-view --clusterrole=view --serviceaccount=kube-system:default --namespace=kube-system
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
It is recommended that you specify a Role and RoleBinding to limit Tiller’s scope to a particular namespace, as described in Helm’s documentation.
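As a minimal sketch of that recommendation (the tiller-ns namespace, service account, and binding names are placeholders; refer to the Helm documentation for the authoritative setup), you could scope Tiller to a single namespace by binding the built-in admin ClusterRole through a namespaced RoleBinding:

# Create a dedicated namespace and service account for Tiller.
kubectl create namespace tiller-ns
kubectl create serviceaccount tiller --namespace tiller-ns
# Bind the built-in "admin" ClusterRole to that account, scoped to tiller-ns only.
kubectl create rolebinding tiller-binding \
  --clusterrole=admin \
  --serviceaccount=tiller-ns:tiller \
  --namespace=tiller-ns
# Install Tiller into the namespace using the scoped service account.
helm init --service-account tiller --tiller-namespace tiller-ns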
See initialize Helm and install Tiller for more information.
Docker Universal Control Plane allows you to manage your cluster in a visual way, from your browser.
UCP secures your cluster by using role-based access control. From the browser, administrators can:
Non-admin users can only see and change the images, networks, volumes, and containers, and only when they’re granted access by an administrator.
Docker Enterprise 2.1 introduces application packages in Docker. With application packages, you can add metadata and settings to an existing Compose file. This gives operators more context about applications that they deploy and manage.
An application package can have one of these formats:

- A directory format: three separate files inside a my-app.dockerapp folder. This is also called the folder format.
- A single-file format: the same three sections separated by "---\n" in a single file named my-app.dockerapp.
To create a stack in the UCP web interface, follow these steps:
Go to the UCP web interface.
In the lefthand menu, first select Shared Resources, then Stacks.
Select Create Stack to display 1. Configure Application in the stack creation dialog.
Enter a name for the stack in the Name field.
Select either Swarm Services or Kubernetes Workloads for the orchestrator mode. If you select Kubernetes, also select a namespace in the Namespace drop-down list.
Select either Compose File or App Package for the Application File Mode.
Select Next.
If you selected Compose File, enter or upload your docker-compose.yml in 2. Add Application File. If you selected App Package, enter or upload the application package in the single-file format.
Select Create.
Here is an example of a single-file application package:
version: 0.1.0
name: hello-world
description: "Hello, World!"
namespace: myHubUsername
maintainers:
  - name: user
    email: "user@email.com"
---
version: "3.6"
services:
  hello:
    image: hashicorp/http-echo
    command: ["-text", "${text}"]
    ports:
      - ${port}:5678
---
port: 8080
text: Hello, World!
You can deploy and monitor your services from the UCP web UI. In this example,
we’ll deploy an NGINX web server and make it accessible on port 8000
.
To deploy a single service:
In your browser, navigate to the UCP web UI and click Services. The Create a Service page opens.
Click Create Service to configure the NGINX service, and complete the following fields:
Field | Value |
---|---|
Service name | nginx |
Image name | nginx:latest |
In the left pane, click Network.
In the Ports section, click Publish Port and complete the following fields:
Field | Value |
---|---|
Target port | 80 |
Protocol | tcp |
Publish mode | Ingress |
Published port | 8000 |
Click Confirm to map the ports for the NGINX service.
Specify the service image and ports, and click Create to deploy the service into the UCP cluster.
Once the service is up and running, you can view the default NGINX page
by going to http://<node-ip>:8000
. In the Services list, click
the nginx service, and in the details pane, click the link under
Published Endpoints.
Clicking the link opens a new tab that shows the default NGINX home page.
You can also deploy the same service from the CLI. Once you’ve set up your UCP client bundle, enter the following command:
docker service create --name nginx \
--publish mode=ingress,target=80,published=8000 \
--label com.docker.ucp.access.owner=<your-username> \
nginx
Docker Universal Control Plane allows you to use the tools you already
know, like docker stack deploy
to deploy multi-service applications.
You can also deploy your applications from the UCP web UI.
In this example we’ll deploy a multi-service application that allows users to vote on whether they prefer cats or dogs.
version: "3"
services:
# A Redis key-value store to serve as message queue
redis:
image: redis:alpine
ports:
- "6379"
networks:
- frontend
# A PostgreSQL database for persistent storage
db:
image: postgres:9.4
volumes:
- db-data:/var/lib/postgresql/data
networks:
- backend
# Web UI for voting
vote:
image: dockersamples/examplevotingapp_vote:before
ports:
- 5000:80
networks:
- frontend
depends_on:
- redis
# Web UI to count voting results
result:
image: dockersamples/examplevotingapp_result:before
ports:
- 5001:80
networks:
- backend
depends_on:
- db
# Worker service to read from message queue
worker:
image: dockersamples/examplevotingapp_worker
networks:
- frontend
- backend
networks:
frontend:
backend:
volumes:
db-data:
To deploy your applications from the UCP web UI, on the left navigation bar expand Shared resources, choose Stacks, and click Create stack.
Choose the name you want for your stack, and choose Swarm services as the deployment mode.
When you choose this option, UCP deploys your app using the Docker swarm built-in orchestrator. If you choose ‘Basic containers’ as the deployment mode, UCP deploys your app using the classic Swarm orchestrator.
Then copy-paste the application definition in docker-compose.yml format.
Once you’re done click Create to deploy the stack.
To deploy the application from the CLI, start by configuring your Docker CLI using a UCP client bundle.
Then, create a file named docker-stack.yml with the content of the YAML above, and run the deploy command shown below.
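A minimal deploy command, using the stack name voting_app to match the docker stack ps example later in this section:

docker stack deploy --compose-file docker-stack.yml voting_app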
Once the multi-service application is deployed, it shows up in the UCP web UI. The ‘Stacks’ page shows that you’ve deployed the voting app.
You can also inspect the individual services of the app you deployed. For that, click the voting_app to open the details pane, open Inspect resources and choose Services, since this app was deployed with the built-in Docker swarm orchestrator.
You can also use the Docker CLI to check the status of your app:
docker stack ps voting_app
Great! The app is deployed so we can cast votes by accessing the service that’s listening on port 5000. You don’t need to know the ports a service listens to. You can click the voting_app_vote service and click on the Published endpoints link.
When deploying applications from the web UI, you can’t reference any external files, no matter if you’re using the built-in swarm orchestrator or classic Swarm. For that reason, the following keywords are not supported:
Also, UCP doesn’t store the stack definition you’ve used to deploy the stack. You can use a version control system for this.
Docker Universal Control Plane enforces role-based access control when you deploy services. By default, you don’t need to do anything, because UCP deploys your services to a default collection, unless you specify another one. You can customize the default collection in your UCP profile page.
UCP defines a collection by its path. For example, a user’s default
collection has the path /Shared/Private/<username>
. To deploy a
service to a collection that you specify, assign the collection’s path
to the access label of the service. The access label is named
com.docker.ucp.access.label
.
When UCP deploys a service, it doesn’t automatically create the collections that correspond with your access labels. An administrator must create these collections and grant users access to them. Deployment fails if UCP can’t find a specified collection or if the user doesn’t have access to it.
Here’s an example of a docker service create
command that deploys a
service to a /Shared/database
collection:
docker service create \
--name redis_2 \
--label com.docker.ucp.access.label="/Shared/database"
redis:3.0.6
You can also specify a target collection for a service in a Compose
file. In the service definition, add a labels:
dictionary, and
assign the collection’s path to the com.docker.ucp.access.label
key.
If you don’t specify access labels in the Compose file, resources are placed in the user’s default collection when the stack is deployed.
You can place a stack’s resources into multiple collections, but most of the time, you won’t need to do this.
Here’s an example of a Compose file that specifies two services,
WordPress and MySQL, and gives them the access label
/Shared/wordpress
:
version: '3.1'
services:
  wordpress:
    image: wordpress
    networks:
      - wp
    ports:
      - 8080:80
    environment:
      WORDPRESS_DB_PASSWORD: example
    deploy:
      labels:
        com.docker.ucp.access.label: /Shared/wordpress
  mysql:
    image: mysql:5.7
    networks:
      - wp
    environment:
      MYSQL_ROOT_PASSWORD: example
    deploy:
      labels:
        com.docker.ucp.access.label: /Shared/wordpress
networks:
  wp:
    driver: overlay
    labels:
      com.docker.ucp.access.label: /Shared/wordpress
To deploy the application:
If the /Shared/wordpress
collection doesn’t exist, or if you don’t
have a grant for accessing it, UCP reports an error.
To confirm that the service deployed to the /Shared/wordpress collection, inspect the service in the UCP web UI and verify that its collection is /Shared/wordpress.
.Note
By default Docker Stacks will create a default overlay
network for your
stack. It will be attached to each container that is deployed. This works if
you have full control over your Default Collection or are an administrator.
If your administrators have locked down UCP to only allow you access to specific collections and you manage multiple collections, it can become difficult to manage the networks as well, and you might run into permission errors. To fix this, define a custom network and attach it to each service. The network must have the same com.docker.ucp.access.label label as your service. If configured correctly, your network is grouped with the other resources in your stack.
When deploying and orchestrating services, you often need to configure them with sensitive information like passwords, TLS certificates, or private keys.
Universal Control Plane allows you to store this sensitive information, also known as secrets, in a secure way. It also gives you role-based access control so that you can control which users can use a secret in their services and which ones can manage the secret.
UCP extends the functionality provided by Docker Engine, so you can continue using the same workflows and tools you already use, like the Docker CLI client.
In this example, we’re going to deploy a WordPress application that’s composed of two services:
Instead of configuring our services to use a plain text password stored in an environment variable, we’re going to create a secret to store the password. When we deploy those services, we’ll attach the secret to them, which creates a file with the password inside the container running the service. Our services will be able to use that file, but no one else will be able to see the plain text password.
To make things simpler, we’re not going to configure the database service to persist data. When the service stops, the data is lost.
In the UCP web UI, open the Swarm section and click Secrets.
Click Create Secret to create a new secret. Once you create the secret you won’t be able to edit it or see the secret data again.
Assign a unique name to the secret and set its value. You can optionally define a permission label so that other users have permission to use this secret. Also note that a service and secret must have the same permission label, or both must have no permission label at all, in order to be used together.
In this example, the secret is named wordpress-password-v1
, to make
it easier to track which version of the password our services are using.
Before creating the MySQL and WordPress services, we need to create the network that they’re going to use to communicate with one another.
Navigate to the Networks page, and create the wordpress-network
with the default settings.
Now create the MySQL service:
This creates a MySQL service that’s attached to the wordpress-network
network and that uses the wordpress-password-v1
secret. By default, this
creates a file with the same name at /run/secrets/<secret-name>
inside the
container running the service.
We also set the MYSQL_ROOT_PASSWORD_FILE
environment variable to
configure MySQL to use the content of the
/run/secrets/wordpress-password-v1
file as the root password.
Now that the MySQL service is running, we can deploy a WordPress service that uses MySQL as a storage backend:
This creates the WordPress service attached to the same network as the MySQL service so that they can communicate, and maps the port 80 of the service to port 8000 of the cluster routing mesh.
Once you deploy this service, you’ll be able to access it using the IP address of any node in your UCP cluster, on port 8000.
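Again as a CLI sketch (the service name, image tag, and WORDPRESS_DB_* variables are assumptions based on the official WordPress image; adjust them to your setup):

# Sketch of the WordPress service, publishing port 80 on port 8000 of the routing mesh.
docker service create \
  --name wordpress \
  --network wordpress-network \
  --secret wordpress-password-v1 \
  --env WORDPRESS_DB_HOST=wordpress-db \
  --env WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1 \
  --publish mode=ingress,target=80,published=8000 \
  wordpress:latest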
If the secret gets compromised, you’ll need to rotate it so that your services start using a new secret. In this case, we need to change the password we’re using and update the MySQL and WordPress services to use the new password.
Since secrets are immutable in the sense that you can’t change the data they store after they are created, we can use the following process to achieve this:
Let’s rotate the secret we’ve created. Navigate to the Secrets page
and create a new secret named wordpress-password-v2
.
This example is simple, and we know which services we need to update, but in the real world, this might not always be the case.
Click the wordpress-password-v1 secret. In the details pane, click Inspect Resource, and in the dropdown, select Services.
Start by updating the wordpress-db
service to stop using the secret
wordpress-password-v1
and use the new version instead.
The MYSQL_ROOT_PASSWORD_FILE environment variable is currently set to look for a file at /run/secrets/wordpress-password-v1, which won’t exist after we update the service. So we have two options:

- Update the environment variable to point at /run/secrets/wordpress-password-v2, or
- Instead of mounting the secret file at /run/secrets/wordpress-password-v2 (the default), customize it to be mounted at /run/secrets/wordpress-password-v1 instead. This way we don’t need to change the environment variable. This is what we’re going to do.

When adding the secret to the services, instead of leaving the Target Name field with the default value, set it to wordpress-password-v1. This makes the file with the content of wordpress-password-v2 be mounted at /run/secrets/wordpress-password-v1.
Delete the wordpress-password-v1
secret, and click Update.
Then do the same thing for the WordPress service. After this is done, the WordPress application is running and using the new password.
Application-layer (Layer 7) routing is the application routing and load balancing (ingress routing) system included with Docker Enterprise for Swarm orchestration. Interlock architecture takes advantage of the underlying Swarm components to provide scalable Layer 7 routing and Layer 4 VIP mode functionality.
Interlock is specific to the Swarm orchestrator. If you’re trying to route traffic to your Kubernetes applications, refer to Cluster ingress for more information.
Interlock uses the Docker Remote API to automatically configure extensions such as NGINX or HAProxy for application traffic. Interlock is designed for:
Docker Engine running in swarm mode has a routing mesh, which makes it easy to expose your services to the outside world. Since all nodes participate in the routing mesh, users can access a service by contacting any node.
For example, a WordPress service is listening on port 8000 of the routing mesh. Even though the service is running on a single node, users can access WordPress using the domain name or IP of any of the nodes that are part of the swarm.
UCP extends this one step further with Layer 7 routing (also known as application-layer routing), allowing users to access Docker services using domain names instead of IP addresses. This functionality is made available through the Interlock component.
Using Interlock in the previous example, users can access the WordPress
service using http://wordpress.example.org
. Interlock takes care of
routing traffic to the correct place.
Interlock has three primary services:
Interlock manages extension and proxy service updates for both configuration changes and application service deployments. No intervention from the operator is required.
The following image shows the default Interlock configuration, once you enable Layer 7 routing in UCP:
The Interlock service starts a single replica on a manager node. The Interlock-extension service runs a single replica on any available node, and the Interlock-proxy service starts two replicas on any available node.
If you don’t have any worker nodes in your cluster, then all Interlock components run on manager nodes.
Layer 7 routing in UCP supports:
This document covers the following considerations:
A good understanding of this content is necessary for the successful deployment and use of Interlock.
When an application image is updated, the following actions occur:
The service is updated with a new version of the application.
The default “stop-first” policy stops the first replica before scheduling the second. The Interlock proxies remove the app.1 task’s IP from the backend pool as the task is removed.
The first application task is rescheduled with the new image after the first task stops.
The interlock proxy.1 is then rescheduled with the new nginx configuration that contains the update for the new app.1 task.
After proxy.1 is complete, proxy.2 redeploys with the updated NGINX configuration for the app.1 task.
In this scenario, the amount of time that the service is unavailable is less than 30 seconds.
Swarm provides control over the order in which old tasks are removed while new ones are created. This is controlled at the service level with --update-order.
- stop-first (default): Configures the currently updating task to stop before the new task is scheduled.
- start-first: Configures the current task to stop after the new task has been scheduled. This guarantees that the new task is running before the old task has shut down.

Use start-first if …

Use stop-first if …
In most cases, start-first
is the best choice because it optimizes
for high availability during updates.
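For example, to switch an existing service to the start-first policy (my-service is a placeholder name):

docker service update --update-order start-first my-service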
Swarm services use update-delay
to control the speed at which a
service is updated. This adds a timed delay between application tasks as
they are updated. The delay controls the time from when the first task
of a service transitions to healthy state and the time that the second
task begins its update. The default is 0 seconds, which means that a
replica task begins updating as soon as the previous updated task
transitions in to a healthy state.
Use update-delay
if …
Do not use update-delay
if …
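For example, to wait thirty seconds between replica updates of a service (my-service is a placeholder name):

docker service update --update-delay 30s my-service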
Swarm uses application health checks extensively to ensure that its
updates do not cause service interruption. health-cmd
can be
configured in a Dockerfile or compose file to define a method for health
checking an application. Without health checks, Swarm cannot determine
when an application is truly ready to service traffic and will mark it
as healthy as soon as the container process is running. This can
potentially send traffic to an application before it is capable of
serving clients, leading to dropped connections.
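As a sketch, an equivalent health check can also be attached directly when creating a service; the /health endpoint, the image name, and the assumption that curl exists inside the image are illustrative:

docker service create \
  --name web \
  --health-cmd "curl -f http://localhost:8080/health || exit 1" \
  --health-interval 10s \
  --health-timeout 3s \
  --health-retries 3 \
  example/web:latest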
stop-grace-period
configures a time period for which the task will
continue to run but will not accept new connections. This allows
connections to drain before the task is stopped, reducing the
possibility of terminating requests in-flight. The default value is 10
seconds. This means that a task continues to run for 10 seconds after
starting its shutdown cycle, which also removes it from the load
balancer to prevent it from accepting new connections. Applications that
receive long-lived connections can benefit from longer shut down cycles
so that connections can terminate normally.
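For example, to give a service with long-lived connections a longer drain window (my-service is a placeholder name):

docker service update --stop-grace-period 60s my-service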
Interlock service clusters allow Interlock to be segmented into multiple logical instances called “service clusters”, which have independently managed proxies. Application traffic only uses the proxies for a specific service cluster, allowing the full segmentation of traffic. Each service cluster only connects to the networks using that specific service cluster, which reduces the number of overlay networks to which proxies connect. Because service clusters also deploy separate proxies, this also reduces the amount of churn in LB configs when there are service updates.
Interlock proxy containers connect to the overlay network of every Swarm service. Having many networks connected to Interlock adds incremental delay when Interlock updates its load balancer configuration. Each network connected to Interlock generally adds 1-2 seconds of update delay. With many networks, the Interlock update delay causes the LB config to be out of date for too long, which can cause traffic to be dropped.
Minimizing the number of overlay networks that Interlock connects to can be accomplished in two ways:
VIP Mode can be used to reduce the impact of application updates on the Interlock proxies. It utilizes the Swarm L4 load balancing VIPs instead of individual task IPs to load balance traffic to a more stable internal endpoint. This prevents the proxy LB configs from changing for most kinds of app service updates reducing churn for Interlock. The following features are not supported in VIP mode:
The following features are supported in VIP mode:
This topic covers deploying a layer 7 routing solution into a Docker Swarm to route traffic to Swarm services. Layer 7 routing is also referred to as an HTTP routing mesh (HRM).
By default, layer 7 routing is disabled, so you must first enable this service from the UCP web UI.
By default, the routing mesh service listens on port 8080 for HTTP and port 8443 for HTTPS. Change the ports if you already have services that are using them.
When layer 7 routing is enabled:
- UCP creates the ucp-interlock overlay network.
- UCP deploys the ucp-interlock service and attaches it both to the Docker socket and the overlay network that was created. This allows the Interlock service to use the Docker API. That’s also the reason why this service needs to run on a manager node.
- The ucp-interlock service starts the ucp-interlock-extension service and attaches it to the ucp-interlock network. This allows both services to communicate.
- The ucp-interlock-extension service generates a configuration to be used by the proxy service. By default the proxy service is NGINX, so this service generates a standard NGINX configuration. UCP creates the com.docker.ucp.interlock.conf-1 configuration file and uses it to configure all the internal components of this service.
- The ucp-interlock service takes the proxy configuration and uses it to start the ucp-interlock-proxy service.

Now you are ready to use the layer 7 routing service with your Swarm workloads. There are three primary Interlock services: core, extension, and proxy.
The following code sample provides a default UCP configuration. This will be created automatically when enabling Interlock as described in this section.
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
AllowInsecure = false
PollInterval = "3s"
[Extensions]
[Extensions.default]
Image = "docker/ucp-interlock-extension:3.2.5"
ServiceName = "ucp-interlock-extension"
Args = []
Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
ProxyImage = "docker/ucp-interlock-proxy:3.2.5"
ProxyServiceName = "ucp-interlock-proxy"
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyReplicas = 2
ProxyStopSignal = "SIGQUIT"
ProxyStopGracePeriod = "5s"
ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
PublishMode = "ingress"
PublishedPort = 8080
TargetPort = 80
PublishedSSLPort = 8443
TargetSSLPort = 443
[Extensions.default.Labels]
"com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
[Extensions.default.ContainerLabels]
"com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
[Extensions.default.ProxyLabels]
"com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
[Extensions.default.ProxyContainerLabels]
"com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
[Extensions.default.Config]
Version = ""
User = "nginx"
PidPath = "/var/run/proxy.pid"
MaxConnections = 1024
ConnectTimeout = 600
SendTimeout = 600
ReadTimeout = 600
IPHash = false
AdminUser = ""
AdminPass = ""
SSLOpts = ""
SSLDefaultDHParam = 1024
SSLDefaultDHParamPath = ""
SSLVerify = "required"
WorkerProcesses = 1
RLimitNoFile = 65535
SSLCiphers = "HIGH:!aNULL:!MD5"
SSLProtocols = "TLSv1.2"
AccessLogPath = "/dev/stdout"
ErrorLogPath = "/dev/stdout"
MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t '$status $body_bytes_sent \"$http_referer\" '\n\t\t '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t '\"$http_x_forwarded_for\" $request_id $msec $request_time '\n\t\t '$upstream_connect_time $upstream_header_time $upstream_response_time';"
KeepaliveTimeout = "75s"
ClientMaxBodySize = "32m"
ClientBodyBufferSize = "8k"
ClientHeaderBufferSize = "1k"
LargeClientHeaderBuffers = "4 8k"
ClientBodyTimeout = "60s"
UnderscoresInHeaders = false
HideInfoHeaders = false
Interlock can also be enabled from the command line, as described in the following sections.
Interlock uses the TOML file for the core service configuration. The following example utilizes Swarm deployment and recovery features by creating a Docker Config object:
$> cat << EOF | docker config create service.interlock.conf -
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"
[Extensions]
[Extensions.default]
Image = "docker/ucp-interlock-extension:3.2.5"
Args = ["-D"]
ProxyImage = "docker/ucp-interlock-proxy:3.2.5"
ProxyArgs = []
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyReplicas = 1
ProxyStopGracePeriod = "3s"
ServiceCluster = ""
PublishMode = "ingress"
PublishedPort = 8080
TargetPort = 80
PublishedSSLPort = 8443
TargetSSLPort = 443
[Extensions.default.Config]
User = "nginx"
PidPath = "/var/run/proxy.pid"
WorkerProcesses = 1
RlimitNoFile = 65535
MaxConnections = 2048
EOF
oqkvv1asncf6p2axhx41vylgt
Next, create a dedicated network for Interlock and the extensions:
$> docker network create -d overlay interlock
Now you can create the Interlock service. Note the requirement to constrain to a manager. The Interlock core service must have access to a Swarm manager, however the extension and proxy services are recommended to run on workers.
$> docker service create \
--name interlock \
--mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
--network interlock \
--constraint node.role==manager \
--config src=service.interlock.conf,target=/config.toml \
docker/ucp-interlock:3.2.5 -D run -c /config.toml
At this point, there should be three (3) services created: one for the Interlock service, one for the extension service, and one for the proxy service:
$> docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
sjpgq7h621ex ucp-interlock replicated 1/1 docker/ucp-interlock:3.2.5
oxjvqc6gxf91 ucp-interlock-extension replicated 1/1 docker/ucp-interlock-extension:3.2.5
lheajcskcbby ucp-interlock-proxy replicated 1/1 docker/ucp-interlock-proxy:3.2.5 *:80->80/tcp *:443->443/tcp
The Interlock traffic layer is now deployed.
This section includes documentation on configuring Interlock for a
production environment. If you have not yet deployed Interlock, refer to
Deploy a layer 7 routing solution because this information builds
upon the basic deployment. This topic does not cover infrastructure
deployment - it assumes you have a vanilla Swarm cluster
(docker swarm init and docker swarm join from the nodes).
The layer 7 solution that ships with UCP is highly available and fault tolerant. It is also designed to work independently of how many nodes you’re managing with UCP.
For a production-grade deployment, you need to perform the following actions:
- Update the ucp-interlock service to deploy proxies using that constraint.

Tuning the default deployment to have two nodes dedicated for running the two replicas of the ucp-interlock-proxy service ensures:
Configure the selected nodes as load balancer worker nodes (for example, lb-00 and lb-01) with node labels in order to pin the Interlock Proxy service. After you log in to one of the Swarm managers, run the following commands to add node labels to the dedicated ingress workers:
$> docker node update --label-add nodetype=loadbalancer lb-00
lb-00
$> docker node update --label-add nodetype=loadbalancer lb-01
lb-01
You can inspect each node to ensure the labels were successfully added:
$> docker node inspect -f '{{ .Spec.Labels }}' lb-00
map[nodetype:loadbalancer]
$> docker node inspect -f '{{ .Spec.Labels }}' lb-01
map[nodetype:loadbalancer]
The output for each node should include nodetype:loadbalancer.
Now that your nodes are labeled, you need to update the ucp-interlock-proxy
service configuration to deploy the proxy service with the correct constraints
(constrained to those workers). From a manager, add a constraint to the
ucp-interlock-proxy
service to update the running service:
$> docker service update --replicas=2 \
--constraint-add node.labels.nodetype==loadbalancer \
--stop-signal SIGQUIT \
--stop-grace-period=5s \
$(docker service ls -f 'label=type=com.docker.interlock.core.proxy' -q)
This updates the proxy service to have two (2) replicas, ensures they are constrained to the workers with the label nodetype==loadbalancer, and configures the stop signal for the tasks to be SIGQUIT with a grace period of five (5) seconds. This ensures that NGINX shuts down gracefully, allowing in-flight client requests to finish before the task exits.
Inspect the service to ensure the replicas have started on the desired nodes:
$> docker service ps $(docker service ls -f 'label=type=com.docker.interlock.core.proxy' -q)
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
o21esdruwu30 interlock-proxy.1 nginx:alpine lb-01 Running Preparing 3 seconds ago
n8yed2gp36o6 \_ interlock-proxy.1 nginx:alpine mgr-01 Shutdown Shutdown less than a second ago
aubpjc4cnw79 interlock-proxy.2 nginx:alpine lb-00 Running Preparing 3 seconds ago
Then add the constraint to the ProxyConstraints
array in the
interlock-proxy
service configuration so it takes effect if Interlock is
restored from backup:
[Extensions]
[Extensions.default]
ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.nodetype==loadbalancer"]
By default, the config service is global, scheduling one task on every node in
the cluster, but it will use proxy constraints if available. To add or change
scheduling restraints, update the ProxyConstraints
variable in the
Interlock configuration file. See configure ucp-interlock for more information.
Once reconfigured, you can check if the proxy service is running on the dedicated nodes:
docker service ps ucp-interlock-proxy
Update the settings in the upstream load balancer (ELB, F5, etc) with the addresses of the dedicated ingress workers. This directs all traffic to these nodes.
You have now configured Interlock for a dedicated ingress production environment.
The following example shows the configuration of an eight (8) node Swarm cluster. There are three (3) managers and five (5) workers. Two of the workers are configured with node labels to be dedicated ingress cluster load balancer nodes. These will receive all application traffic. There is also an upstream load balancer (such as an Elastic Load Balancer or F5). The upstream load balancers will be statically configured for the two load balancer worker nodes.
This configuration has several benefits. The management plane is both isolated and redundant. No application traffic hits the managers and application ingress traffic can be routed to the dedicated nodes. These nodes can be configured with higher performance network interfaces to provide more bandwidth for the user services.
To install Interlock on a Docker cluster without internet access, the Docker images must be loaded. This topic describes how to export the images from a local Docker engine and then load them to the Docker Swarm cluster.
First, using an existing Docker engine, save the images:
$> docker save docker/ucp-interlock:3.2.5 > interlock.tar
$> docker save docker/ucp-interlock-extension:3.2.5 > interlock-extension-nginx.tar
$> docker save docker/ucp-interlock-proxy:3.2.5 > interlock-proxy-nginx.tar
Note
Replace
docker/ucp-interlock-extension:3.2.5
and docker/ucp-interlock-proxy:3.2.5
with the corresponding extension and proxy image if you are not using
Nginx.
You should have the following three files:

- interlock.tar: This is the core Interlock application.
- interlock-extension-nginx.tar: This is the Interlock extension for NGINX.
- interlock-proxy-nginx.tar: This is the official NGINX image based on Alpine.

Next, copy these files to each node in the Docker Swarm cluster and run the following commands to load each image:
$> docker load < interlock.tar
$> docker load < interlock-extension-nginx.tar
$> docker load < interlock-proxy-nginx.tar
The HTTP routing mesh functionality was redesigned in UCP 3.0 for greater security and flexibility. The functionality was also renamed to “layer 7 routing”, to make it easier for new users to get started.
To route traffic to your service you apply specific labels to your swarm services, describing the hostname for the service and other configurations. Things work in the same way as they did with the HTTP routing mesh, with the only difference being that you use different labels.
You don’t have to manually update your services. During the upgrade process to 3.0, UCP updates the services to start using new labels.
This article describes the upgrade process for the routing component, so that you can troubleshoot UCP and your services, in case something goes wrong with the upgrade.
If you are using the HTTP routing mesh, and start an upgrade to UCP 3.0:
- UCP creates the com.docker.ucp.interlock.conf-<id> configuration based on the HRM configuration.
- The ucp-interlock service is deployed with the configuration created.
- The ucp-interlock service deploys the ucp-interlock-extension and ucp-interlock-proxy services.
.The only way to rollback from an upgrade is by restoring from a backup taken before the upgrade. If something goes wrong during the upgrade process, you need to troubleshoot the interlock services and your services, since the HRM service won’t be running after the upgrade.
After upgrading to UCP 3.0, you should check if all swarm services are still routable.
For services using HTTP:
curl -vs http://<ucp-url>:<hrm-http-port>/ -H "Host: <service-hostname>"
For services using HTTPS:
curl -vs https://<ucp-url>:<hrm-https-port>
After the upgrade, check that you can still use the same hostnames to access the swarm services.
After the upgrade to UCP 3.0, the following services should be running:
- ucp-interlock: Monitors swarm workloads configured to use layer 7 routing.
- ucp-interlock-extension: Helper service that generates the configuration for the ucp-interlock-proxy service.
- ucp-interlock-proxy: A service that provides load balancing and proxying for swarm workloads.

To check if these services are running, use a client bundle with administrator permissions and run:
docker ps --filter "name=ucp-interlock"
If the ucp-interlock
service doesn’t exist or is not running,
something went wrong with the reconciliation step.
If this still doesn’t work, it’s possible that UCP is having problems creating the com.docker.ucp.interlock.conf-1 configuration due to name conflicts. Make sure you don’t have any configuration with the same name by running:
docker config ls --filter "name=com.docker.ucp.interlock"
If either the ucp-interlock-extension
or ucp-interlock-proxy
services are not running, it’s possible that there are port conflicts. As a
workaround re-enable the layer 7 routing configuration from the Deploy a
layer 7 routing solution page. Make sure the ports you choose are not
being used by other services.
If you have any of the problems above, disable and enable the layer 7 routing setting on the Deploy a layer 7 routing solution page. This redeploys the services with their default configuration.
When doing that make sure you specify the same ports you were using for HRM, and that no other services are listening on those ports.
You should also check if the ucp-hrm
service is running. If it is, you
should stop it since it can conflict with the ucp-interlock-proxy
service.
As part of the upgrade process UCP adds the labels specific to the new layer 7 routing solution.
You can update your services to remove the old HRM labels, since they won’t be used anymore.
Interlock is designed so that all the control traffic is kept separate from the application traffic.
If before upgrading you had all your applications attached to the ucp-hrm
network, after upgrading you can update your services to start using a
dedicated network for routing that’s not shared with other services.
If before upgrading you had a dedicated network to route traffic to each service, Interlock continues using those dedicated networks. However, the ucp-interlock service will be attached to each of those networks. You can update the ucp-interlock service so that it is only connected to the ucp-hrm network.
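A minimal sketch of detaching the service from one of those networks, assuming a dedicated application network named app-routing-net:

docker service update --network-rm app-routing-net ucp-interlock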
To further customize the layer 7 routing solution, you must update the
ucp-interlock
service with a new Docker configuration.
Find out what configuration is currently being used for the
ucp-interlock
service and save it to a file:
CURRENT_CONFIG_NAME=$(docker service inspect --format '{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' ucp-interlock)
docker config inspect --format '{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml
Make the necessary changes to the config.toml
file.
Create a new Docker configuration object from the config.toml
file:
NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME config.toml
Update the ucp-interlock
service to start using the new configuration:
docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock
By default, the ucp-interlock
service is configured to roll back to
a previous stable configuration if you provide an invalid configuration.
If you want the service to pause instead of rolling back, you can update it with the following command:
docker service update \
--update-failure-action pause \
ucp-interlock
Note
When you enable the layer 7 routing solution from the UCP UI, the
ucp-interlock
service is started using the default configuration.
If you’ve customized the configuration used by the ucp-interlock
service, you must update it again to use the Docker configuration object
you’ve created.
The following sections describe how to configure the primary Interlock services:
The core configuration handles the Interlock service itself. The
following configuration options are available for the ucp-interlock
service.
Option | Type | Description |
---|---|---|
ListenAddr | string | Address to serve the Interlock GRPC API. Defaults to 8080. |
DockerURL | string | Path to the socket or TCP address to the Docker API. Defaults to unix:///var/run/docker.sock. |
TLSCACert | string | Path to the CA certificate for connecting securely to the Docker API. |
TLSCert | string | Path to the certificate for connecting securely to the Docker API. |
TLSKey | string | Path to the key for connecting securely to the Docker API. |
AllowInsecure | bool | Skip TLS verification when connecting to the Docker API via TLS. |
PollInterval | string | Interval to poll the Docker API for changes. Defaults to 3s. |
EndpointOverride | string | Override the default GRPC API endpoint for extensions. The default is detected via Swarm. |
Extensions | []Extension | Array of extensions as listed below. |
Interlock must contain at least one extension to service traffic. The following options are available to configure the extensions.
Option | Type | Description |
---|---|---|
Image | string | Name of the Docker image to use for the extension. |
Args | []string | Arguments to be passed to the extension service. |
Labels | map[string]string | Labels to add to the extension service. |
ContainerLabels | map[string]string | Labels for the extension service tasks. |
Constraints | []string | One or more constraints to use when scheduling the extension service. |
PlacementPreferences | []string | One or more placement preferences. |
ServiceName | string | Name of the extension service. |
ProxyImage | string | Name of the Docker image to use for the proxy service. |
ProxyArgs | []string | Arguments to pass to the proxy service. |
ProxyLabels | map[string]string | Labels to add to the proxy service. |
ProxyContainerLabels | map[string]string | Labels to be added to the proxy service tasks. |
ProxyServiceName | string | Name of the proxy service. |
ProxyConfigPath | string | Path in the service for the generated proxy config. |
ProxyReplicas | uint | Number of proxy service replicas. |
ProxyStopSignal | string | Stop signal for the proxy service, for example SIGQUIT. |
ProxyStopGracePeriod | string | Stop grace period for the proxy service in seconds, for example 5s. |
ProxyConstraints | []string | One or more constraints to use when scheduling the proxy service. Set the variable to false, as it is currently set to true by default. |
ProxyPlacementPreferences | []string | One or more placement preferences to use when scheduling the proxy service. |
ProxyUpdateDelay | string | Delay between rolling proxy container updates. |
ServiceCluster | string | Name of the cluster this extension services. |
PublishMode | string (ingress or host) | Publish mode that the proxy service uses. |
PublishedPort | int | Port on which the proxy service serves non-SSL traffic. |
PublishedSSLPort | int | Port on which the proxy service serves SSL traffic. |
Template | int | Docker configuration object that is used as the extension template. |
Config | Config | Proxy configuration used by the extensions as described in this section. |
Options are made available to the extensions, and the extensions utilize the options needed for proxy service configuration. This provides overrides to the extension configuration.
Because Interlock passes the extension configuration directly to the extension, each extension has different configuration options available. Refer to the documentation for each extension for supported options:
The default proxy service used by UCP to provide layer 7 routing is NGINX. If users try to access a route that hasn’t been configured, they will see the default NGINX 404 page:
You can customize this by labeling a service with
com.docker.lb.default_backend=true
. In this case, if users try to
access a route that’s not configured, they are redirected to this
service.
As an example, create a docker-compose.yml
file with:
version: "3.2"
services:
demo:
image: ehazlett/interlock-default-app
deploy:
replicas: 1
labels:
com.docker.lb.default_backend: "true"
com.docker.lb.port: 80
networks:
- demo-network
networks:
demo-network:
driver: overlay
Set up your CLI client with a UCP client bundle, and deploy the service:
docker stack deploy --compose-file docker-compose.yml demo
If users try to access a route that’s not configured, they are directed to this demo service.
The following is an example configuration to use with the NGINX extension.
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"
[Extensions.default]
Image = "docker/interlock-extension-nginx:3.2.5"
Args = ["-D"]
ServiceName = "interlock-ext"
ProxyImage = "docker/ucp-interlock-proxy:3.2.5"
ProxyArgs = []
ProxyServiceName = "interlock-proxy"
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyStopGracePeriod = "3s"
PublishMode = "ingress"
PublishedPort = 80
ProxyReplicas = 1
TargetPort = 80
PublishedSSLPort = 443
TargetSSLPort = 443
[Extensions.default.Config]
User = "nginx"
PidPath = "/var/run/proxy.pid"
WorkerProcesses = 1
RlimitNoFile = 65535
MaxConnections = 2048
By default, layer 7 routing components communicate with one another using overlay networks, but Interlock supports host mode networking in a variety of ways, including proxy only, Interlock only, application only, and hybrid.
When using host mode networking, you cannot use DNS service discovery, since that functionality requires overlay networking. For services to communicate, each service needs to know the IP address of the node where the other service is running.
To use host mode networking instead of overlay networking:
If you have not done so, configure the layer 7 routing solution for
production. The ucp-interlock-proxy
service replicas should then be running
on their own dedicated nodes.
Update the ucp-interlock service configuration so that it uses host mode networking.
Update the PublishMode
key to:
PublishMode = "host"
When updating the ucp-interlock
service to use the new Docker
configuration, make sure to update it so that it starts publishing its
port on the host:
docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
--publish-add mode=host,target=8080 \
ucp-interlock
The ucp-interlock
and ucp-interlock-extension
services are now
communicating using host mode networking.
Now you can deploy your swarm services. Set up your CLI client with a UCP client bundle, and deploy the service. The following example deploys a demo service that also uses host mode networking:
docker service create \
--name demo \
--detach=false \
--label com.docker.lb.hosts=app.example.org \
--label com.docker.lb.port=8080 \
--publish mode=host,target=8080 \
--env METADATA="demo" \
ehazlett/docker-demo
In this example, Docker allocates a high random port on the host where the service can be reached.
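To find the randomly assigned port, one option is to look at the published ports of the task’s container on the node where it is running; a sketch:

docker ps --filter name=demo --format '{{.Names}} {{.Ports}}'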
To test that everything is working, run the following command:
curl --header "Host: app.example.org" \
http://<proxy-address>:<routing-http-port>/ping
Where:
- <proxy-address> is the domain name or IP address of a node where the proxy service is running.
- <routing-http-port> is the port you’re using to route HTTP traffic.
{"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}
The following example describes how to configure an eight (8) node Swarm cluster that uses host mode networking to route traffic without using overlay networks. There are three (3) managers and five (5) workers. Two of the workers are configured with node labels to be dedicated ingress cluster load balancer nodes. These will receive all application traffic.
This example does not cover the actual deployment of infrastructure. It
assumes you have a vanilla Swarm cluster (docker swarm init and docker swarm join from the nodes).
Note
When using host mode networking, you cannot use the DNS service discovery because that requires overlay networking. You can use other tooling, such as Registrator, to get that functionality if needed.
Configure the load balancer worker nodes (lb-00
and lb-01
) with
node labels in order to pin the Interlock Proxy service. Once you are
logged into one of the Swarm managers run the following to add node
labels to the dedicated load balancer worker nodes:
$> docker node update --label-add nodetype=loadbalancer lb-00
lb-00
$> docker node update --label-add nodetype=loadbalancer lb-01
lb-01
Inspect each node to ensure the labels were successfully added:
$> docker node inspect -f '{{ .Spec.Labels }}' lb-00
map[nodetype:loadbalancer]
$> docker node inspect -f '{{ .Spec.Labels }}' lb-01
map[nodetype:loadbalancer]
Next, create a configuration object for Interlock that specifies host mode networking:
$> cat << EOF | docker config create service.interlock.conf -
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"
[Extensions]
[Extensions.default]
Image = "docker/ucp-interlock-extension:3.2.5"
Args = []
ServiceName = "interlock-ext"
ProxyImage = "docker/ucp-interlock-proxy:3.2.5"
ProxyArgs = []
ProxyServiceName = "interlock-proxy"
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyReplicas = 1
PublishMode = "host"
PublishedPort = 80
TargetPort = 80
PublishedSSLPort = 443
TargetSSLPort = 443
[Extensions.default.Config]
User = "nginx"
PidPath = "/var/run/proxy.pid"
WorkerProcesses = 1
RlimitNoFile = 65535
MaxConnections = 2048
EOF
oqkvv1asncf6p2axhx41vylgt
Note
The PublishMode = "host" setting instructs Interlock to configure the proxy service for host mode networking.
Now create the Interlock service also using host mode networking:
$> docker service create \
--name interlock \
--mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
--constraint node.role==manager \
--publish mode=host,target=8080 \
--config src=service.interlock.conf,target=/config.toml \
docker/ucp-interlock:3.2.5 -D run -c /config.toml
sjpgq7h621exno6svdnsvpv9z
With the node labels, you can re-configure the Interlock Proxy services to be constrained to the workers. From a manager run the following to pin the proxy services to the load balancer worker nodes:
$> docker service update \
--constraint-add node.labels.nodetype==loadbalancer \
interlock-proxy
Now you can deploy the application:
$> docker service create \
--name demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--publish mode=host,target=8080 \
--env METADATA="demo" \
ehazlett/docker-demo
This runs the service using host mode networking. Each task for the service has a high port (for example, 32768) and uses the node IP address to connect. You can see this when inspecting the headers from the request:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
curl -vs -H "Host: demo.local" http://127.0.0.1/ping
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Fri, 10 Nov 2017 15:38:40 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 110
< Connection: keep-alive
< Set-Cookie: session=1510328320174129112; Path=/; Expires=Sat, 11 Nov 2017 15:38:40 GMT; Max-Age=86400
< x-request-id: e4180a8fc6ee15f8d46f11df67c24a7d
< x-proxy-id: d07b29c99f18
< x-server-info: interlock/2.0.0-preview (17476782) linux/amd64
< x-upstream-addr: 172.20.0.4:32768
< x-upstream-response-time: 1510328320.172
<
{"instance":"897d3c7b9e9c","version":"0.1","metadata":"demo","request_id":"e4180a8fc6ee15f8d46f11df67c24a7d"}
By default, nginx is used as a proxy, so the following configuration options are available for the nginx extension:
Option | Type | Description | Defaults |
---|---|---|---|
User | string | User name for the proxy | nginx |
PidPath | string | Path to the pid file for the proxy service | /var/run/proxy.pid |
MaxConnections | int | Maximum number of connections for the proxy service | 1024 |
ConnectTimeout | int | Timeout in seconds for clients to connect | 600 |
SendTimeout | int | Timeout in seconds for the service to send a request to the proxied upstream | 600 |
ReadTimeout | int | Timeout in seconds for the service to read a response from the proxied upstream | 600 |
SSLOpts | int | Options to be passed when configuring SSL | |
SSLDefaultDHParam | int | Size of DH parameters | 1024 |
SSLDefaultDHParamPath | string | Path to DH parameters file | |
SSLVerify | string | SSL client verification | required |
WorkerProcesses | string | Number of worker processes for the proxy service | 1 |
RLimitNoFile | int | Maximum number of open files for the proxy service | 65535 |
SSLCiphers | string | SSL ciphers to use for the proxy service | HIGH:!aNULL:!MD5 |
SSLProtocols | string | Enable the specified TLS protocols | TLSv1.2 |
HideInfoHeaders | bool | Hide proxy-related response headers | |
KeepaliveTimeout | string | Connection keepalive timeout | 75s |
ClientMaxBodySize | string | Maximum allowed size of the client request body | 1m |
ClientBodyBufferSize | string | Buffer size for reading the client request body | 8k |
ClientHeaderBufferSize | string | Buffer size for reading the client request header | 1k |
LargeClientHeaderBuffers | string | Maximum number and size of buffers used for reading large client request headers | 4 8k |
ClientBodyTimeout | string | Timeout for reading the client request body | 60s |
UnderscoresInHeaders | bool | Enables or disables the use of underscores in client request header fields | false |
ServerNamesHashBucketSize | int | Sets the bucket size for the server names hash tables (in KB) | 128 |
UpstreamZoneSize | int | Size of the shared memory zone (in KB) | 64 |
GlobalOptions | []string | List of options that are included in the global configuration | |
HTTPOptions | []string | List of options that are included in the http configuration | |
TCPOptions | []string | List of options that are included in the stream (TCP) configuration | |
AccessLogPath | string | Path to use for access logs | /dev/stdout |
ErrorLogPath | string | Path to use for error logs | /dev/stdout |
MainLogFormat | string | Format to use for the main logger | |
TraceLogFormat | string | Format to use for the trace logger | |
Refer to Proxy service constraints for information on how to constrain the proxy service to multiple dedicated worker nodes.
To adjust the stop signal and period, use the stop-signal
and
stop-grace-period
settings. For example, to set the stop signal to
SIGTERM
and grace period to ten (10) seconds, use the following
command:
$> docker service update --stop-signal=SIGTERM --stop-grace-period=10s interlock-proxy
In the event of an update failure, the default Swarm action is to
“pause”. This prevents Interlock updates from happening without operator
intervention. You can change this behavior using the
update-failure-action
setting. For example, to automatically
rollback to the previous configuration upon failure, use the following
command:
$> docker service update --update-failure-action=rollback interlock-proxy
By default, Interlock configures the proxy service using rolling update.
For more time between proxy updates, such as to let a service settle,
use the update-delay
setting. For example, if you want to have
thirty (30) seconds between updates, use the following command:
$> docker service update --update-delay=30s interlock-proxy
There are two parts to the update process:
Create the new configuration:
$> docker config create service.interlock.conf.v2 <path-to-new-config>
Remove the old configuration and specify the new configuration:
$> docker service update --config-rm service.interlock.conf ucp-interlock
$> docker service update --config-add source=service.interlock.conf.v2,target=/config.toml ucp-interlock
Next, update the Interlock service to use the new image. To pull the latest version of UCP, run the following:
$> docker pull docker/ucp:latest
latest: Pulling from docker/ucp
cd784148e348: Already exists
3871e7d70c20: Already exists
cad04e4a4815: Pull complete
Digest: sha256:63ca6d3a6c7e94aca60e604b98fccd1295bffd1f69f3d6210031b72fc2467444
Status: Downloaded newer image for docker/ucp:latest
docker.io/docker/ucp:latest
Next, list all the latest UCP images.
$> docker run --rm docker/ucp images --list
docker/ucp-agent:3.2.5
docker/ucp-auth-store:3.2.5
docker/ucp-auth:3.2.5
docker/ucp-azure-ip-allocator:3.2.5
docker/ucp-calico-cni:3.2.5
docker/ucp-calico-kube-controllers:3.2.5
docker/ucp-calico-node:3.2.5
docker/ucp-cfssl:3.2.5
docker/ucp-compose:3.2.5
docker/ucp-controller:3.2.5
docker/ucp-dsinfo:3.2.5
docker/ucp-etcd:3.2.5
docker/ucp-hyperkube:3.2.5
docker/ucp-interlock-extension:3.2.5
docker/ucp-interlock-proxy:3.2.5
docker/ucp-interlock:3.2.5
docker/ucp-kube-compose-api:3.2.5
docker/ucp-kube-compose:3.2.5
docker/ucp-kube-dns-dnsmasq-nanny:3.2.5
docker/ucp-kube-dns-sidecar:3.2.5
docker/ucp-kube-dns:3.2.5
docker/ucp-metrics:3.2.5
docker/ucp-pause:3.2.5
docker/ucp-swarm:3.2.5
docker/ucp:3.2.5
Interlock starts and checks the config object, which has the new extension version, and performs a rolling deploy to update all extensions.
$> docker service update \
--image docker/ucp-interlock:3.2.5 \
ucp-interlock
After Interlock is deployed, you can launch and publish services and applications. Use Service Labels to configure services to publish themselves to the load balancer.
The following examples assume a DNS entry (or local hosts entry if you are testing locally) exists for each of the applications.
Create a Docker Service using two labels:
com.docker.lb.hosts
com.docker.lb.port
The com.docker.lb.hosts label instructs Interlock where the service should be available. The com.docker.lb.port label specifies which port the proxy service should use to access the upstreams.
Publish a demo service to the host demo.local
:
First, create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, deploy the application:
$> docker service create \
--name demo \
--network demo \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
ehazlett/docker-demo
6r0wiglf5f3bdpcy6zesh1pzx
Interlock detects when the service is available and publishes it. After
tasks are running and the proxy service is updated, the application is
available via http://demo.local
.
$> curl -s -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2f1afe673d4","version":"0.1",request_id":"7bcec438af14f8875ffc3deab9215bc5"}
To increase service capacity, use the Docker Service Scale command:
$> docker service scale demo=4
demo scaled to 4
In this example, four service replicas are configured as upstreams. The load balancer balances traffic across all service replicas.
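To observe the load balancing across the four replicas, you can repeat the request and watch the instance field change between responses; a small sketch:

for i in 1 2 3 4; do
  curl -s -H "Host: demo.local" http://127.0.0.1/ping
  echo
done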
This example deploys a simple service that can be reached at http://app.example.org.

Create a docker-compose.yml file with:
version: "3.2"
services:
demo:
image: ehazlett/docker-demo
deploy:
replicas: 1
labels:
com.docker.lb.hosts: app.example.org
com.docker.lb.network: demo_demo-network
com.docker.lb.port: 8080
networks:
- demo-network
networks:
demo-network:
driver: overlay
Note that:

- If the network you are attaching to already exists, declare it with external: true in the docker-compose.yml file.
- The com.docker.lb.hosts label defines the hostname for the service. When the layer 7 routing solution gets a request containing app.example.org in the host header, that request is forwarded to the demo service.
- The com.docker.lb.network label defines which network the ucp-interlock-proxy should attach to in order to be able to communicate with the demo service. To use layer 7 routing, your services need to be attached to at least one network. If your service is only attached to a single network, you don’t need to add a label to specify which network to use for routing. When using a common stack file for multiple deployments leveraging UCP Interlock / Layer 7 Routing, prefix com.docker.lb.network with the stack name to ensure traffic will be directed to the correct overlay network. When used in combination with com.docker.lb.ssl_passthrough, the label is mandatory, even if your service is only attached to a single network.
- The com.docker.lb.port label specifies which port the ucp-interlock-proxy service should use to communicate with this demo service.

Set up your CLI client with a UCP client bundle and deploy the service:
docker stack deploy --compose-file docker-compose.yml demo
The ucp-interlock
service detects that your service is using these labels
and automatically reconfigures the ucp-interlock-proxy
service.
To test that requests are routed to the demo service, run:
curl --header "Host: app.example.org" \
http://<ucp-address>:<routing-http-port>/ping
Where:
- <ucp-address> is the domain name or IP address of a UCP node.
- <routing-http-port> is the port you’re using to route HTTP traffic.
{"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}
Since the demo service exposes an HTTP endpoint, you can also use your browser to validate that everything is working.
Make sure the /etc/hosts
file in your system has an entry mapping
app.example.org
to the IP address of a UCP node. Once you do that,
you’ll be able to start using the service from your browser.
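For example, on Linux or macOS you could append an entry like the following, where 192.0.2.10 is a placeholder for the IP address of one of your UCP nodes:

echo "192.0.2.10 app.example.org" | sudo tee -a /etc/hosts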
The following example publishes a service as a canary instance.
First, create an overlay network to isolate and secure service traffic:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, create the initial service:
$> docker service create \
--name demo-v1 \
--network demo \
--detach=false \
--replicas=4 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--env METADATA="demo-version-1" \
ehazlett/docker-demo
Interlock detects when the service is available and publishes it. After
tasks are running and the proxy service is updated, the application is
available via http://demo.local
:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to demo.local (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:28:26 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 120
< Connection: keep-alive
< Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
< x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
< x-proxy-id: 5ad7c31f9f00
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.4:8080
< x-upstream-response-time: 1510172906.714
<
{"instance":"df20f55fc943","version":"0.1","metadata":"demo-version-1","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}
Notice metadata
is specified with demo-version-1
.
The following example deploys an updated service as a canary instance:
$> docker service create \
--name demo-v2 \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--env METADATA="demo-version-2" \
--env VERSION="0.2" \
ehazlett/docker-demo
Since this has a replica of one (1), and the initial version has four
(4) replicas, 20% of application traffic is sent to demo-version-2
:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"23d9a5ec47ef","version":"0.1","metadata":"demo-version-1","request_id":"060c609a3ab4b7d9462233488826791c"}
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"f42f7f0a30f9","version":"0.1","metadata":"demo-version-1","request_id":"c848e978e10d4785ac8584347952b963"}
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"1b0d55ed3d2f","version":"0.2","metadata":"demo-version-2","request_id":"b86ff1476842e801bf20a1b5f96cf94e"}
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}
To increase traffic to the new version, add more replicas with docker service scale:
$> docker service scale demo-v2=4
demo-v2
To complete the upgrade, scale the demo-v1
service to zero (0):
$> docker service scale demo-v1=0
demo-v1
This routes all application traffic to the new version. If you need to rollback, simply scale the v1 service back up and v2 down.
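For example, the rollback is just the reverse scaling operation:

docker service scale demo-v1=4 demo-v2=0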
The following example publishes a service using context or path based routing.
First, create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, create the initial service:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.context_root=/app \
--label com.docker.lb.context_root_rewrite=true \
--env METADATA="demo-context-root" \
ehazlett/docker-demo
Only one path per host
Interlock only supports one path per host per service cluster. When a
specific com.docker.lb.hosts
label is applied, it cannot be
applied again in the same service cluster.
Interlock detects when the service is available and publishes it. After the
tasks are running and the proxy service is updated, the application is
available via http://demo.local
:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/app/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /app/ HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Fri, 17 Nov 2017 14:25:17 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< x-request-id: 077d18b67831519defca158e6f009f82
< x-proxy-id: 77c0c37d2c46
< x-server-info: interlock/2.0.0-dev (732c77e7) linux/amd64
< x-upstream-addr: 10.0.1.3:8080
< x-upstream-response-time: 1510928717.306
You can publish services using “vip” and “task” backend routing modes.
Task routing is the default Interlock behavior and the default backend mode if one is not specified. In task routing mode, Interlock uses backend task IPs to route traffic from the proxy to each container. Traffic to the frontend route is L7 load balanced directly to service tasks. This allows for per-container routing functionality such as sticky sessions. Task routing mode applies L7 routing and then sends packets directly to a container.
VIP mode is an alternative mode of routing in which Interlock uses the Swarm service VIP as the backend IP instead of container IPs. Traffic to the frontend route is L7 load balanced to the Swarm service VIP, which L4 load balances to backend tasks. VIP mode can be useful to reduce the amount of churn in Interlock proxy service configuration, which can be an advantage in highly dynamic environments.
VIP mode optimizes for fewer proxy updates in a tradeoff for a reduced feature set. Most application updates do not require configuring backends in VIP mode.
In VIP routing mode, Interlock uses the service VIP (a persistent endpoint that exists from service creation to service deletion) as the proxy backend. VIP routing mode was introduced in UCP versions 3.0.3 and 3.1.2. VIP routing mode applies L7 routing and then sends packets to the Swarm L4 load balancer, which routes traffic to the service containers.
While VIP mode provides endpoint stability in the face of application churn, it cannot support sticky sessions because sticky sessions depend on routing directly to container IPs. Sticky sessions are therefore not supported in VIP mode.
Because VIP mode routes by service IP rather than by task IP it also affects the behavior of canary deployments. In task mode a canary service with one task next to an existing service with four tasks represents one out of five total tasks, so the canary will receive 20% of incoming requests. By contrast the same canary service in VIP mode will receive 50% of incoming requests, because it represents one out of two total services.
You can set the backend mode on a per-service basis, which means that some applications can be deployed in task mode, while others are deployed in VIP mode.
The default backend mode is task
. If a label is set to task
or a label
does not exist, then Interlock uses the task
routing mode.
To use Interlock VIP mode, the following label must be applied:
com.docker.lb.backend_mode=vip
In VIP mode, the following non-exhaustive list of application events does not require proxy reconfiguration:
The following two updates still require a proxy reconfiguration (because these actions create or destroy a service VIP):
The following example publishes a service to be a default host. The service responds whenever there is a request to a host that is not configured.
First, create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, create the initial service:
$> docker service create \
--name demo-default \
--network demo \
--detach=false \
--replicas=1 \
--label com.docker.lb.default_backend=true \
--label com.docker.lb.port=8080 \
ehazlett/interlock-default-app
Interlock detects when the service is available and publishes it. After tasks are running and the proxy service is updated, the application is available via any URL that is not configured:
Create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Create the initial service:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=4 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.backend_mode=vip \
--env METADATA="demo-vip-1" \
ehazlett/docker-demo
Interlock detects when the service is available and publishes it. After
tasks are running and the proxy service is updated, the application
should be available via http://demo.local
:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to demo.local (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:28:26 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 120
< Connection: keep-alive
< Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
< x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
< x-proxy-id: 5ad7c31f9f00
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.9:8080
< x-upstream-response-time: 1510172906.714
<
{"instance":"df20f55fc943","version":"0.1","metadata":"demo","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}
Instead of using each task IP for load balancing, configuring VIP mode causes Interlock to use the Virtual IPs of the service instead. Inspecting the service shows the VIPs:
"Endpoint": {
"Spec": {
"Mode": "vip"
},
"VirtualIPs": [
{
"NetworkID": "jed11c1x685a1r8acirk2ylol",
"Addr": "10.0.2.9/24"
}
]
}
In this case, Interlock configures a single upstream for the host using the IP “10.0.2.9”. Interlock skips further proxy updates as long as there is at least 1 replica for the service because the only upstream is the VIP.
Swarm routes requests for the VIP in a round robin fashion at L4. This means that the following Interlock features are incompatible with VIP mode:
After you enable the layer 7 routing solution, you can start using it in your swarm services.
Service labels define hostnames that are routed to the service, the applicable ports, and other routing configurations. Applications that publish using Interlock use service labels to configure how they are published.
When you deploy or update a swarm service with service labels, the following actions occur:
- The ucp-interlock service monitors the Docker API for events and publishes the events to the ucp-interlock-extension service.
- The ucp-interlock-extension service generates a new configuration for the proxy service based on those labels.
- The ucp-interlock service takes the new configuration and reconfigures the ucp-interlock-proxy to start using the new configuration.

The previous steps occur in milliseconds and with rolling updates. Even though services are being reconfigured, users won’t notice it.
Label | Description | Example |
---|---|---|
com.docker.lb.hosts | Comma separated list of the hosts that the service should serve. | example.com, test.com |
com.docker.lb.port | Port to use for internal upstream communication. | 8080 |
com.docker.lb.network | Name of network the proxy service should attach to for upstream connectivity. | app-network-a |
com.docker.lb.context_root | Context or path to use for the application. | /app |
com.docker.lb.context_root_rewrite | When set to true, this option changes the path from the value of the com.docker.lb.context_root label to /. | true |
com.docker.lb.ssl_cert | Docker secret to use for the SSL certificate. | example.com.cert |
com.docker.lb.ssl_key | Docker secret to use for the SSL key. | example.com.key |
com.docker.lb.websocket_endpoints | Comma separated list of endpoints to configure to be upgraded for websockets. | /ws,/foo |
com.docker.lb.service_cluster | Name of the service cluster to use for the application. | us-east |
com.docker.lb.sticky_session_cookie | Cookie to use for sticky sessions. | app_session |
com.docker.lb.redirects | Semi-colon separated list of redirects to add in the format of <source>, <target>. | http://old.example.com, http://new.example.com |
com.docker.lb.ssl_passthrough | Enable SSL passthrough. | false |
com.docker.lb.backend_mode | Select the backend mode that the proxy should use to access the upstreams. Defaults to task. | vip |
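As an illustrative sketch that combines a few of these labels, the following hypothetical service enables cookie-based sticky sessions (which require the default task backend mode) using the app_session cookie name from the table above; the demo network and image match earlier examples in this topic:

docker service create \
  --name demo-sticky \
  --network demo \
  --label com.docker.lb.hosts=demo.local \
  --label com.docker.lb.port=8080 \
  --label com.docker.lb.sticky_session_cookie=app_session \
  ehazlett/docker-demo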
The following example publishes a service and configures a redirect from
old.local
to new.local
.
Note
There is currently a limitation where redirects do not work if a service is configured for TLS passthrough in Interlock proxy.
First, create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, create the service with the redirect:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=old.local,new.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.redirects=http://old.local,http://new.local \
--env METADATA="demo-new" \
ehazlett/docker-demo
Interlock detects when the service is available and publishes it. After tasks are running and the proxy service is updated, the application is available via http://new.local, with a redirect configured that sends http://old.local to http://new.local:
$> curl -vs -H "Host: old.local" http://127.0.0.1
* Rebuilt URL to: http://127.0.0.1/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: old.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 19:06:27 GMT
< Content-Type: text/html
< Content-Length: 161
< Connection: keep-alive
< Location: http://new.local/
< x-request-id: c4128318413b589cafb6d9ff8b2aef17
< x-proxy-id: 48854cd435a4
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
<
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.13.6</center>
</body>
</html>
Reconfiguring Interlock’s proxy can take 1-2 seconds per overlay network managed by that proxy. To scale to a larger number of Docker networks and services routed by Interlock, consider implementing service clusters. Service clusters are multiple proxy services managed by Interlock (rather than the default single proxy service), each responsible for routing to a separate set of Docker services and their corresponding networks, thereby minimizing proxy reconfiguration time.
In this example, we’ll assume you have a UCP cluster set up with at least two
worker nodes, ucp-node-0
and ucp-node-1
; we’ll use these as dedicated
proxy servers for two independent Interlock service clusters. We’ll also assume
you’ve already enabled Interlock with an HTTP port of 80 and an HTTPS port of
8443.
First, apply some node labels to the UCP workers you’ve chosen to use as your proxy servers. From a UCP manager:
docker node update --label-add nodetype=loadbalancer --label-add region=east ucp-node-0
docker node update --label-add nodetype=loadbalancer --label-add region=west ucp-node-1
We’ve labeled ucp-node-0 to be the proxy for our east region, and ucp-node-1 to be the proxy for our west region.
Let’s also create a dedicated overlay network for each region’s proxy to manage traffic on. We could create many networks for each region, but bear in mind the cumulative performance hit this incurs:
docker network create --driver overlay eastnet
docker network create --driver overlay westnet
Next, modify Interlock’s configuration to create two service clusters. Start by writing its current configuration out to a file which you can modify:
CURRENT_CONFIG_NAME=$(docker service inspect --format '{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' ucp-interlock)
docker config inspect --format '{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > old_config.toml
Make a new config file called config.toml with the following content, which declares two service clusters, east and west.
Note
You will have to change the UCP version (3.2.3 in the example below) to match yours, as well as all instances of *.ucp.InstanceID (vl5umu06ryluu66uzjcv5h1bo below):
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
AllowInsecure = false
PollInterval = "3s"
[Extensions]
[Extensions.east]
Image = "docker/ucp-interlock-extension:3.2.3"
ServiceName = "ucp-interlock-extension-east"
Args = []
Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
ConfigImage = "docker/ucp-interlock-config:3.2.3"
ConfigServiceName = "ucp-interlock-config-east"
ProxyImage = "docker/ucp-interlock-proxy:3.2.3"
ProxyServiceName = "ucp-interlock-proxy-east"
ServiceCluster="east"
Networks=["eastnet"]
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyReplicas = 1
ProxyStopSignal = "SIGQUIT"
ProxyStopGracePeriod = "5s"
ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==east"]
PublishMode = "host"
PublishedPort = 80
TargetPort = 80
PublishedSSLPort = 8443
TargetSSLPort = 443
[Extensions.east.Labels]
"ext_region" = "east"
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.east.ContainerLabels]
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.east.ProxyLabels]
"proxy_region" = "east"
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.east.ProxyContainerLabels]
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.east.Config]
Version = ""
HTTPVersion = "1.1"
User = "nginx"
PidPath = "/var/run/proxy.pid"
MaxConnections = 1024
ConnectTimeout = 5
SendTimeout = 600
ReadTimeout = 600
IPHash = false
AdminUser = ""
AdminPass = ""
SSLOpts = ""
SSLDefaultDHParam = 1024
SSLDefaultDHParamPath = ""
SSLVerify = "required"
WorkerProcesses = 1
RLimitNoFile = 65535
SSLCiphers = "HIGH:!aNULL:!MD5"
SSLProtocols = "TLSv1.2"
AccessLogPath = "/dev/stdout"
ErrorLogPath = "/dev/stdout"
MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t '$status $body_bytes_sent \"$http_referer\" '\n\t\t '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t '$upstream_connect_time $upstream_header_time $upstream_response_time';"
KeepaliveTimeout = "75s"
ClientMaxBodySize = "32m"
ClientBodyBufferSize = "8k"
ClientHeaderBufferSize = "1k"
LargeClientHeaderBuffers = "4 8k"
ClientBodyTimeout = "60s"
UnderscoresInHeaders = false
UpstreamZoneSize = 64
ServerNamesHashBucketSize = 128
GlobalOptions = []
HTTPOptions = []
TCPOptions = []
HideInfoHeaders = false
[Extensions.west]
Image = "docker/ucp-interlock-extension:3.2.3"
ServiceName = "ucp-interlock-extension-west"
Args = []
Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
ConfigImage = "docker/ucp-interlock-config:3.2.3"
ConfigServiceName = "ucp-interlock-config-west"
ProxyImage = "docker/ucp-interlock-proxy:3.2.3"
ProxyServiceName = "ucp-interlock-proxy-west"
ServiceCluster="west"
Networks=["westnet"]
ProxyConfigPath = "/etc/nginx/nginx.conf"
ProxyReplicas = 1
ProxyStopSignal = "SIGQUIT"
ProxyStopGracePeriod = "5s"
ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==west"]
PublishMode = "host"
PublishedPort = 80
TargetPort = 80
PublishedSSLPort = 8443
TargetSSLPort = 443
[Extensions.west.Labels]
"ext_region" = "west"
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.west.ContainerLabels]
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.west.ProxyLabels]
"proxy_region" = "west"
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.west.ProxyContainerLabels]
"com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
[Extensions.west.Config]
Version = ""
HTTPVersion = "1.1"
User = "nginx"
PidPath = "/var/run/proxy.pid"
MaxConnections = 1024
ConnectTimeout = 5
SendTimeout = 600
ReadTimeout = 600
IPHash = false
AdminUser = ""
AdminPass = ""
SSLOpts = ""
SSLDefaultDHParam = 1024
SSLDefaultDHParamPath = ""
SSLVerify = "required"
WorkerProcesses = 1
RLimitNoFile = 65535
SSLCiphers = "HIGH:!aNULL:!MD5"
SSLProtocols = "TLSv1.2"
AccessLogPath = "/dev/stdout"
ErrorLogPath = "/dev/stdout"
MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t '$status $body_bytes_sent \"$http_referer\" '\n\t\t '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t '$upstream_connect_time $upstream_header_time $upstream_response_time';"
KeepaliveTimeout = "75s"
ClientMaxBodySize = "32m"
ClientBodyBufferSize = "8k"
ClientHeaderBufferSize = "1k"
LargeClientHeaderBuffers = "4 8k"
ClientBodyTimeout = "60s"
UnderscoresInHeaders = false
UpstreamZoneSize = 64
ServerNamesHashBucketSize = 128
GlobalOptions = []
HTTPOptions = []
TCPOptions = []
HideInfoHeaders = false
If instead you prefer to modify the config file Interlock creates by default, the crucial parts to adjust for a service cluster are:
- Replace [Extensions.default] with [Extensions.east].
- Change ServiceName to "ucp-interlock-extension-east".
- Change ProxyServiceName to "ucp-interlock-proxy-east".
- Add "node.labels.region==east" to the ProxyConstraints list.
- Add ServiceCluster="east" immediately below and inline with ProxyServiceName.
- Add Networks=["eastnet"] immediately below and inline with ServiceCluster. (Note that this list can contain as many overlay networks as you like; Interlock will only connect to the specified networks, and will connect to them all at startup.)
- Change PublishMode="ingress" to PublishMode="host".
- Change [Extensions.default.Labels] to [Extensions.east.Labels].
- Add "ext_region" = "east" under the [Extensions.east.Labels] section.
- Change [Extensions.default.ContainerLabels] to [Extensions.east.ContainerLabels].
- Change [Extensions.default.ProxyLabels] to [Extensions.east.ProxyLabels].
- Add "proxy_region" = "east" under the [Extensions.east.ProxyLabels] section.
- Change [Extensions.default.ProxyContainerLabels] to [Extensions.east.ProxyContainerLabels].
- Change [Extensions.default.Config] to [Extensions.east.Config].
- Change ProxyReplicas=2 to ProxyReplicas=1; this is necessary only if there is a single node labeled to be a proxy for each service cluster.
- Repeat the [Extensions.east] block a second time, changing east to west for your west service cluster.

Create a new docker config object from this configuration file:
NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME config.toml
Update the ucp-interlock service to start using this new configuration:
docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock
Finally, run docker service ls. You should see two services providing Interlock proxies, ucp-interlock-proxy-east and ucp-interlock-proxy-west. If you only see one Interlock proxy service, delete it with docker service rm. After a moment, the two new proxy services should be created, and Interlock will be successfully configured with two service clusters.
Now that you’ve set up your service clusters, you can deploy services to be
routed to by each proxy by using the service_cluster
label. Create two
example services:
docker service create --name demoeast \
--network eastnet \
--label com.docker.lb.hosts=demo.A \
--label com.docker.lb.port=8000 \
--label com.docker.lb.service_cluster=east \
training/whoami:latest
docker service create --name demowest \
--network westnet \
--label com.docker.lb.hosts=demo.B \
--label com.docker.lb.port=8000 \
--label com.docker.lb.service_cluster=west \
training/whoami:latest
Recall that ucp-node-0
was your proxy for the east
service cluster.
Attempt to reach your whoami
service there:
curl -H "Host: demo.A" http://<ucp-node-0 public IP>
You should receive a response indicating the container ID of the whoami
container declared by the demoeast
service. Attempt the same curl
at
ucp-node-1
’s IP, and it will fail: the Interlock proxy running there only
routes traffic to services with the service_cluster=west
label, connected
to the westnet
Docker network you listed in that service cluster’s
configuration.
Finally, make sure your second service cluster is working analogously to the first:
curl -H "Host: demo.B" http://<ucp-node-1 public IP>
The service routed by Host: demo.B
is reachable via (and only via) the
Interlock proxy mapped to port 80 on ucp-node-1
. At this point, you have
successfully set up and demonstrated that Interlock can manage multiple proxies
routing only to services attached to a select subset of Docker networks.
You can publish a service and configure the proxy for persistent (sticky) sessions using either cookies or client IP hashing.
To configure sticky sessions using cookies:
Create an overlay network so that service traffic is isolated and secure, as shown in the following example:
docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Create a service with the cookie to use for sticky sessions:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=5 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.sticky_session_cookie=session \
--label com.docker.lb.port=8080 \
--env METADATA="demo-sticky" \
ehazlett/docker-demo
Interlock detects when the service is available and publishes it. When tasks
are running and the proxy service is updated, the application is available via
http://demo.local
and is configured to use sticky sessions:
$> curl -vs -c cookie.txt -b cookie.txt -H "Host: demo.local" http://127.0.0.1/ping
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
> Cookie: session=1510171444496686286
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:04:36 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 117
< Connection: keep-alive
* Replaced cookie session="1510171444496686286" for domain demo.local, path /, expire 0
< Set-Cookie: session=1510171444496686286
< x-request-id: 3014728b429320f786728401a83246b8
< x-proxy-id: eae36bf0a3dc
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.5:8080
< x-upstream-response-time: 1510171476.948
<
{"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}
Notice the Set-Cookie
from the application. This is stored by the
curl
command and is sent with subsequent requests, which are pinned
to the same instance. If you make a few requests, you will notice the
same x-upstream-addr
.
The following example shows how to configure sticky sessions using client IP
hashing. This is not as flexible or consistent as cookies but enables
workarounds for some applications that cannot use the other method. When using
IP hashing, reconfigure Interlock proxy to use host mode networking, because
the default ingress
networking mode uses SNAT, which obscures client IP
addresses.
Create an overlay network so that service traffic is isolated and secure:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Create a service that uses IP hashing for sticky sessions:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=5 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.ip_hash=true \
--env METADATA="demo-sticky" \
ehazlett/docker-demo
Interlock detects when the service is available and publishes it. When
tasks are running and the proxy service is updated, the application is
available via http://demo.local
and is configured to use sticky
sessions:
$> curl -vs -H "Host: demo.local" http://127.0.0.1/ping
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:04:36 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 117
< Connection: keep-alive
< x-request-id: 3014728b429320f786728401a83246b8
< x-proxy-id: eae36bf0a3dc
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.5:8080
< x-upstream-response-time: 1510171476.948
<
{"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}
You can use docker service scale demo=10
to add more replicas. When
scaled, requests are pinned to a specific backend.
Note
Due to the way the IP hashing works for extensions, you will notice a new upstream address when scaling replicas. This is expected, because internally the proxy uses the new set of replicas to determine a backend on which to pin. When the upstreams are determined, a new “sticky” backend is chosen as the dedicated upstream.
After deploying a layer 7 routing solution, you have two options for securing your services with TLS: let the proxy service terminate the TLS connection, or encrypt traffic end-to-end so that your own service terminates TLS.
Regardless of the option selected to secure swarm services, there are two steps required to route traffic with TLS: store the certificate and private key as Docker secrets, and add labels to your swarm service that tell Interlock how to route and terminate the TLS traffic.
The following example deploys a swarm service and lets the proxy service handle the TLS connection. All traffic between the proxy and the swarm service is unencrypted, so use this option only if you trust that no one can monitor traffic inside your datacenter.
Start by getting a private key and certificate for the TLS connection. Make sure the Common Name in the certificate matches the name where your service is going to be available.
You can generate a self-signed certificate for app.example.org
by
running:
openssl req \
-new \
-newkey rsa:4096 \
-days 3650 \
-nodes \
-x509 \
-subj "/C=US/ST=CA/L=SF/O=Docker-demo/CN=app.example.org" \
-keyout app.example.org.key \
-out app.example.org.cert
Then, create a docker-compose.yml file with the following content:
version: "3.2"
services:
demo:
image: ehazlett/docker-demo
deploy:
replicas: 1
labels:
com.docker.lb.hosts: app.example.org
com.docker.lb.network: demo-network
com.docker.lb.port: 8080
com.docker.lb.ssl_cert: demo_app.example.org.cert
com.docker.lb.ssl_key: demo_app.example.org.key
environment:
METADATA: proxy-handles-tls
networks:
- demo-network
networks:
demo-network:
driver: overlay
secrets:
app.example.org.cert:
file: ./app.example.org.cert
app.example.org.key:
file: ./app.example.org.key
Notice that the demo service has labels specifying that the proxy service
should route app.example.org
traffic to this service. All traffic between
the service and proxy takes place using the demo-network
network. The
service also has labels specifying the Docker secrets to use on the proxy
service for terminating the TLS connection.
Because the private key and certificate are stored as Docker secrets, you can easily scale the number of replicas used for running the proxy service. Docker distributes the secrets to the replicas.
Set up your CLI client with a UCP client bundle and deploy the service:
docker stack deploy --compose-file docker-compose.yml demo
The service is now running. To test that everything is working correctly,
update your /etc/hosts
file to map app.example.org
to the IP address of
a UCP node.
In a production deployment, you must create a DNS entry so that users can access the service using the domain name of your choice. After creating the DNS entry, you can access your service:
https://<hostname>:<https-port>
For this example:
- hostname is the name you specified with the com.docker.lb.hosts label.
- https-port is the port you configured in the UCP settings.

Because this example uses self-signed certificates, client tools like browsers display a warning that the connection is insecure.
You can also test from the CLI:
curl --insecure \
--resolve <hostname>:<https-port>:<ucp-ip-address> \
https://<hostname>:<https-port>/ping
If everything is properly configured, you should get a JSON payload:
{"instance":"f537436efb04","version":"0.1","request_id":"5a6a0488b20a73801aa89940b6f8c5d2"}
Because the proxy uses SNI to decide where to route traffic, make sure you are
using a version of curl
that includes the SNI header with insecure
requests. Otherwise, curl
displays an error saying that the SSL handshake
was aborted.
Note
Currently there is no way to update expired certificates using this method. The proper way is to create a new secret, then update the corresponding service.
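As a hedged sketch of that rotation (the .v2 secret names and the demo_demo service name are illustrative assumptions based on the stack deployed in this example):
# Create new secrets from the renewed certificate and key (names are illustrative)
docker secret create app.example.org.cert.v2 app.example.org.cert
docker secret create app.example.org.key.v2 app.example.org.key

# Point the service labels at the new secrets so the proxy picks them up
docker service update \
  --label-add com.docker.lb.ssl_cert=app.example.org.cert.v2 \
  --label-add com.docker.lb.ssl_key=app.example.org.key.v2 \
  demo_demo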
The second option for securing with TLS involves encrypting traffic from end users to your swarm service.
To do that, deploy your swarm service using the following
docker-compose.yml
file:
version: "3.2"
services:
demo:
image: ehazlett/docker-demo
command: --tls-cert=/run/secrets/cert.pem --tls-key=/run/secrets/key.pem
deploy:
replicas: 1
labels:
com.docker.lb.hosts: app.example.org
com.docker.lb.network: demo-network
com.docker.lb.port: 8080
com.docker.lb.ssl_passthrough: "true"
environment:
METADATA: end-to-end-TLS
networks:
- demo-network
secrets:
- source: app.example.org.cert
target: /run/secrets/cert.pem
- source: app.example.org.key
target: /run/secrets/key.pem
networks:
demo-network:
driver: overlay
secrets:
app.example.org.cert:
file: ./app.example.org.cert
app.example.org.key:
file: ./app.example.org.key
The service is updated to start using the secrets with the private key and
certificate. The service is also labeled with com.docker.lb.ssl_passthrough:
true
, signaling UCP to configure the proxy service such that TLS traffic for
app.example.org
is passed to the service.
Since the connection is fully encrypted from end-to-end, the proxy service cannot add metadata such as version information or request ID to the response headers.
First, create an overlay network to isolate and secure service traffic:
$> docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
Next, create the service with websocket endpoints:
$> docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.websocket_endpoints=/ws \
ehazlett/websocket-chat
Note
For websockets to work, you must have an entry for demo.local in your local hosts file (i.e., /etc/hosts). This example uses the browser for websocket communication, so you must have a hosts entry or use a routable domain.
Interlock detects when the service is available and publishes it. Once
tasks are running and the proxy service is updated, the application
should be available via http://demo.local
. Open two instances of
your browser and text should be displayed on both instances as you type.
The following diagram shows which Kubernetes resources are visible from the UCP web interface.
You can use the UCP web UI to deploy your Kubernetes YAML files. In most cases, modifications are not necessary to deploy on a cluster managed by Docker Enterprise.
In this example, a simple Kubernetes Deployment object for an NGINX server is defined in a YAML file.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: nginx-deployment
spec:
selector:
matchLabels:
app: nginx
replicas: 2
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
This YAML file specifies an earlier version of NGINX, which will be updated in a later section.
To create the deployment, paste the previous YAML into the Object YAML editor on the Create Kubernetes Object page and click Create.
The UCP web UI shows the status of your deployment when you click the links in the Kubernetes section of the left pane.
The NGINX server is up and running, but it’s not accessible from outside
of the cluster. Create a YAML file to add a NodePort
service to expose
the server on a specified port.
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
type: NodePort
ports:
- port: 80
nodePort: 32768
selector:
app: nginx
The service connects the cluster’s internal port 80 to the external port 32768.
To expose the server:
Repeat the previous steps and copy-paste the YAML file that defines the
nginx
service into the Object YAML editor on the Create
Kubernetes Object page. When you click Create, the Load
Balancers page opens.
Click the nginx service, and in the details pane, find the Ports section.
Click the link that’s labeled URL to view the default NGINX page.
The YAML definition connects the service to the NGINX server using
the app label nginx
and a corresponding label selector.
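As an optional command-line check (the node address placeholder is an assumption, not part of the original steps), the NodePort can also be reached directly on any cluster node:
curl http://<ucp-node-public-IP>:32768
A successful request returns the default NGINX welcome page.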
Update an existing deployment by applying an updated YAML file. In this example, the server is scaled up to four replicas and updated to a later version of NGINX.
...
spec:
progressDeadlineSeconds: 600
replicas: 4
revisionHistoryLimit: 10
selector:
matchLabels:
app: nginx
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: nginx
spec:
containers:
- image: nginx:1.8
...
From the left pane, click Controllers and select nginx-deployment.
In the details pane, click Configure, and in the Edit Deployment page, find the replicas: 2 entry.
Change the number of replicas to 4, so the line reads replicas: 4.
Find the image: nginx:1.7.9 entry and change it to image: nginx:1.8.
Click Save to update the deployment with the new YAML file.
From the left pane, click Pods to view the newly created replicas.
With Docker Enterprise, you deploy your Kubernetes objects on the command line using kubectl.
Use a client bundle to configure your client tools, like the Docker CLI and kubectl, to communicate with UCP instead of the local deployments you might have running.
When you have the client bundle set up, you can deploy a Kubernetes object from the YAML file.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: nginx-deployment
spec:
selector:
matchLabels:
app: nginx
replicas: 2
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
type: NodePort
ports:
- port: 80
nodePort: 32768
selector:
app: nginx
Save the previous YAML file to a file named “deployment.yaml”, and use the following command to deploy the NGINX server:
kubectl apply -f deployment.yaml
Use the describe deployment
option to inspect the deployment:
kubectl describe deployment nginx-deployment
Also, you can use the UCP web UI to see the deployment’s pods and controllers.
Update an existing deployment by applying an updated YAML file.
Edit deployment.yaml and change the following lines: set replicas: 2 to replicas: 4, and change image: nginx:1.7.9 to image: nginx:1.8.
Save the edited YAML file to a file named “update.yaml”, and use the following command to deploy the NGINX server:
kubectl apply -f update.yaml
Check that the deployment was scaled out by listing the deployments in the cluster:
kubectl get deployments
You should see four pods in the deployment:
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 4 4 4 4 2d
Check that the pods are running the updated image:
kubectl describe deployment nginx-deployment | grep -i image
You should see the currently running image:
Image: nginx:1.8
Docker Enterprise enables deploying Docker Compose files to Kubernetes
clusters. Starting in Compose file version 3.3, you use the same
docker-compose.yml
file that you use for Swarm deployments, but you
specify Kubernetes workloads when you deploy the stack. The result
is a true Kubernetes app.
To deploy a stack to Kubernetes, you need a namespace for the app’s
resources. Contact your Docker EE administrator to get access to a
namespace. In this example, the namespace is called labs
.
In this example, you create a simple app, named “lab-words”, by using a Compose file. This assumes you are deploying onto a cloud infrastructure. The following YAML defines the stack:
version: '3.3'
services:
web:
image: dockersamples/k8s-wordsmith-web
ports:
- "8080:80"
words:
image: dockersamples/k8s-wordsmith-api
deploy:
replicas: 5
db:
image: dockersamples/k8s-wordsmith-db
In your browser, log in to https://<ucp-url>
. Navigate to
Shared Resources > Stacks.
Click Create Stack to open up the “Create Application” page.
Under “Configure Application”, type “lab-words” for the application name.
Select Kubernetes Workloads for Orchestrator Mode.
In the Namespace dropdown, select “labs”.
Under “Application File Mode”, leave Compose File selected and click Next.
Paste the previous YAML, then click Create to deploy the stack.
After a few minutes have passed, all of the pods in the lab-words
deployment are running.
To inspect the deployment:
Navigate to Kubernetes > Pods. Confirm that there are seven pods and that their status is Running. If any pod has a status of Pending, wait until every pod is running.
Next, select Kubernetes > Load balancers and find the web-published service.
Click the web-published service, and scroll down to the Ports section.
Under Ports, grab the Node Port information.
In a new tab or window, enter your cloud instance public IP Address
and append :<NodePort>
from the previous step. For example, to
find the public IP address of an EC2 instance, refer to Amazon EC2
Instance IP
Addressing.
The app is displayed.
Kubernetes enables access control for workloads by providing service
accounts. A service account represents an identity for processes that
run in a pod. When a process is authenticated through a service account,
it can contact the API server and access cluster resources. If a pod
doesn’t have an assigned service account, it gets the default
service account.
In Docker Enterprise, you give a service account access to cluster resources by creating a grant, the same way that you would give access to a user or a team.
In this example, you will create a service account and a grant that could be used for an NGINX server.
A Kubernetes user account is global, but a service account is scoped to a namespace, so you need to create a namespace before you create a service account.
Navigate to the Namespaces page and click Create.
In the Object YAML editor, append the following text.
metadata:
name: nginx
Click Create.
In the nginx namespace, click the More options icon, and in the context menu, select Set Context, and click Confirm.
Click the Set context for all namespaces toggle and click Confirm.
Create a service account named nginx-service-account in the nginx namespace.
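As a minimal sketch (assuming a client bundle is configured for kubectl), the service account can be created with:
kubectl create serviceaccount nginx-service-account --namespace nginx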
To give the service account access to cluster resources, create a grant
with Restricted Control
permissions.
Navigate to the Grants page and click Create Grant.
In the left pane, click Resource Sets, and in the Type section, click Namespaces.
Select the nginx namespace.
In the left pane, click Roles. In the Role dropdown, select Restricted Control.
In the left pane, click Subjects, and select Service Account.
Important
The Service Account option in the Subject Type section appears only when a Kubernetes namespace is present.
In the Namespace dropdown, select nginx, and in the Service Account dropdown, select nginx-service-account.
Click Create.
Now nginx-service-account has access to all cluster resources that are assigned to the nginx namespace.
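As an illustrative sketch (the pod name and image are assumptions, not part of the original walkthrough), a workload runs under this identity by setting serviceAccountName in its spec:
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-sa-demo
  namespace: nginx
spec:
  serviceAccountName: nginx-service-account
  containers:
  - name: nginx
    image: nginx
EOF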
For UCP, Calico provides the secure networking functionality for container-to-container communication within Kubernetes. UCP handles the lifecycle of Calico and packages it with UCP installation and upgrade. Additionally, the Calico deployment included with UCP is fully supported with Docker providing guidance on the container network interface (CNI) components.
At install time, UCP can be configured to install an alternative CNI plugin to support alternative use cases. The alternative CNI plugin may be certified by Docker and its partners, and published on Docker Hub. UCP components are still fully supported by Docker and respective partners. Docker will provide pointers to basic configuration; however, for additional guidance on managing third-party CNI components, the platform operator will need to refer to the partner documentation or contact that third party.
UCP does not manage the version or configuration of alternative CNI plugins. UCP upgrade will not upgrade or reconfigure alternative CNI plugins. To switch between managed and unmanaged CNI plugins, you must uninstall and then reinstall UCP.
Once a platform operator has complied with UCP system requirements and taken into consideration any requirements for the custom CNI plugin, they can run the UCP install command with the --unmanaged-cni flag to bring up the platform.
This command will install UCP and bring up components like the user interface and the RBAC engine. UCP components that require Kubernetes networking, such as metrics, will not start and will stay in a ContainerCreating state in Kubernetes until a CNI plugin is installed.
Once connected to a manager node with Docker Enterprise
installed, you are ready to install UCP with the --unmanaged-cni
flag.
docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.2.5 install \
--host-address <node-ip-address> \
--unmanaged-cni \
--interactive
Once the installation is complete, you can access UCP from a web browser. Note
that the manager node will be unhealthy as the kubelet
will report NetworkPluginNotReady
. Additionally, the metrics in the
UCP dashboard will also be unavailable, as this runs in a Kubernetes
pod.
Next, a platform operator should log in to UCP, download a UCP client
bundle, and configure the Kubernetes CLI tool, kubectl
.
With kubectl
, you can see that the UCP components running on
Kubernetes are still pending, waiting for a CNI driver before becoming
available.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
manager-01 NotReady master 10m v1.11.9-docker-1
$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
compose-565f7cf9ff-gq2gv 0/1 Pending 0 10m <none> <none> <none>
compose-api-574d64f46f-r4c5g 0/1 Pending 0 10m <none> <none> <none>
kube-dns-6d96c4d9c6-8jzv7 0/3 Pending 0 10m <none> <none> <none>
ucp-metrics-nwt2z 0/3 ContainerCreating 0 10m <none> manager-01 <none>
You can use kubectl to install a custom CNI plugin on UCP. Alternative CNI plugins include Weave, Flannel, Canal, Romana, and many more. Platform operators have complete flexibility on what to install, but Docker will not support the CNI plugin itself.
The steps for installing a CNI plugin typically include:
- Downloading the relevant CNI plugin binaries and placing them in /opt/cni/bin on each node.
- Applying the plugin's Kubernetes manifest:
$ kubectl apply -f <your-custom-cni-plugin>.yaml
Follow the CNI plugin documentation for specific installation instructions.
Note
While troubleshooting a custom CNI plugin, you may wish to access
logs within the kubelet. Connect to a UCP manager node and run
$ docker logs ucp-kubelet
.
Upon successful installation of the CNI plugin, the related UCP
components should have a Running
status as pods start to become
available.
$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
compose-565f7cf9ff-gq2gv 1/1 Running 0 21m 10.32.0.2 manager-01 <none>
compose-api-574d64f46f-r4c5g 1/1 Running 0 21m 10.32.0.3 manager-01 <none>
kube-dns-6d96c4d9c6-8jzv7 3/3 Running 0 22m 10.32.0.5 manager-01 <none>
ucp-metrics-nwt2z 3/3 Running 0 22m 10.32.0.4 manager-01 <none>
weave-net-wgvcd 2/2 Running 0 8m 172.31.6.95 manager-01 <none>
Note
The above example deployment uses Weave. If you are using an alternative CNI plugin, look for the relevant name and review its status.
Docker Enterprise provides data-plane level IPSec network encryption to securely encrypt application traffic in a Kubernetes cluster. This secures application traffic within a cluster when running in untrusted infrastructure or environments. It is an optional feature of UCP that is enabled by deploying the SecureOverlay components on Kubernetes when using the default Calico driver for networking configured for IPIP tunneling (the default configuration).
Kubernetes network encryption is enabled by two components in UCP:
The agent is deployed as a per-node service that manages the encryption state of the data plane. The agent controls the IPSec encryption on Calico’s IPIP tunnel traffic between different nodes in the Kubernetes cluster. The master is the second component, deployed on a UCP manager node, which acts as the key management process that configures and periodically rotates the encryption keys.
Kubernetes network encryption uses AES Galois Counter Mode (AES-GCM) with 128-bit keys by default. Encryption is not enabled by default and requires the SecureOverlay Agent and Master to be deployed on UCP to begin encrypting traffic within the cluster. It can be enabled or disabled at any time during the cluster lifecycle. However, note that it can cause temporary traffic outages between pods during the first few minutes after it is enabled or disabled. When enabled, Kubernetes pod traffic between hosts is encrypted at the IPIP tunnel interface in the UCP host.
Kubernetes network encryption is supported for the following platforms:
Before deploying the SecureOverlay components, ensure that Calico is configured so that the IPIP tunnel maximum transmission unit (MTU), that is, the largest packet length that the interface will allow, leaves sufficient headroom for the encryption overhead. Encryption adds 26 bytes of overhead, every IPSec packet size must be a multiple of 4 bytes, and IPIP tunnels require 20 bytes of encapsulation overhead. The IPIP tunnel interface MTU must therefore be no more than “EXTMTU - 46 - ((EXTMTU - 46) modulo 4)”, where EXTMTU is the minimum MTU of the external interfaces. For example, with a standard external MTU of 1500, this gives 1500 - 46 = 1454, and 1454 - (1454 modulo 4) = 1452, so an IPIP MTU of 1452 should generally be safe for most deployments.
Changing UCP’s MTU requires updating the UCP configuration.
Update the following values to the new MTU:
[cluster_config]
...
calico_mtu = "1452"
ipip_mtu = "1452"
...
SecureOverlay allows you to enable IPSec network encryption in Kubernetes. Once the cluster nodes’ MTUs are properly configured, deploy the SecureOverlay components using the SecureOverlay YAML file to UCP.
Beginning with UCP 3.2.4, you can configure SecureOverlay in one of two ways:
- Add secure-overlay to the UCP configuration file.
- Deploy the SecureOverlay components with the SecureOverlay YAML file, as described below.

To deploy the components manually, first download the SecureOverlay YAML file.
Issue the following command from any machine with the properly configured kubectl environment and the proper UCP bundle’s credentials:
$ kubectl apply -f ucp-secureoverlay.yml
Run this command at cluster installation time before starting any workloads.
To remove the encryption from the system, issue the following command:
$ kubectl delete -f ucp-secureoverlay.yml
Users can provide persistent storage for workloads running on Docker Enterprise by using NFS storage. These NFS shares, when mounted into the running container, provide state to the application, managing data external to the container’s lifecycle.
Note
Provisioning an NFS server and exporting an NFS share are out of scope for this guide. Additionally, using external Kubernetes plugins to dynamically provision NFS shares is also out of scope for this guide.
There are two options for mounting existing NFS shares within Kubernetes Pods:
- Define the NFS share directly in the Pod (or other workload) specification.
- Define the NFS share as a PersistentVolume, then mount it through a PersistentVolumeClaim.
A minimal sketch of the first option follows.
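This sketch assumes an NFS server at nfs.example.com exporting /exports/app; both are placeholders, not values from this guide:
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: mypod-nfs
spec:
  containers:
  - image: nginx
    name: mypod
    volumeMounts:
    - name: nfs-share
      mountPath: /data
  volumes:
  - name: nfs-share
    nfs:
      server: nfs.example.com
      path: /exports/app
      readOnly: false
EOF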
Platform operators can provide persistent storage for workloads running on Docker Enterprise and Microsoft Azure by using Azure Disk. Platform operators can either pre-provision Azure Disks to be consumed by Kubernetes Pods, or can use the Azure Kubernetes integration to dynamically provision Azure Disks on demand.
This guide assumes you have already provisioned a UCP environment on Microsoft Azure. The Cluster must be provisioned after meeting all of the prerequisites listed in Install UCP on Azure.
Additionally, this guide uses the Kubernetes Command Line tool
$ kubectl
to provision Kubernetes objects within a UCP cluster.
Therefore, this tool must be downloaded, along with a UCP client bundle.
An operator can use existing Azure Disks or manually provision new ones to provide persistent storage for Kubernetes Pods. Azure Disks can be manually provisioned in the Azure Portal, using ARM Templates or the Azure CLI. The following example uses the Azure CLI to manually provision an Azure Disk.
$ RG=myresourcegroup
$ az disk create \
--resource-group $RG \
--name k8s_volume_1 \
--size-gb 20 \
--query id \
--output tsv
Using the Azure CLI command in the previous example should return the Azure ID of the Azure Disk Object. If you are provisioning Azure resources using an alternative method, make sure you retrieve the Azure ID of the Azure Disk, because it is needed for another step.
/subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>
You can now create Kubernetes Objects that refer to this Azure Disk. The following example uses a Kubernetes Pod. However, the same Azure Disk syntax can be used for DaemonSets, Deployments, and StatefulSets. In the following example, the Azure Disk Name and ID reflect the manually created Azure Disk.
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
name: mypod-azuredisk
spec:
containers:
- image: nginx
name: mypod
volumeMounts:
- name: mystorage
mountPath: /data
volumes:
- name: mystorage
azureDisk:
kind: Managed
diskName: k8s_volume_1
diskURI: /subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>
EOF
Kubernetes can dynamically provision Azure Disks using the Azure Kubernetes integration, which was configured when UCP was installed. For Kubernetes to determine which APIs to use when provisioning storage, you must create Kubernetes Storage Classes specific to each storage backend.
In Azure, there are two different Azure Disk types that can be consumed by Kubernetes: Azure Disk Standard Volumes and Azure Disk Premium Volumes.
Depending on your use case, you can deploy one or both of the Azure Disk storage classes (Standard and Premium).
To create a Standard Storage Class:
$ cat <<EOF | kubectl create -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: standard
provisioner: kubernetes.io/azure-disk
parameters:
storageaccounttype: Standard_LRS
kind: Managed
EOF
To Create a Premium Storage Class:
$ cat <<EOF | kubectl create -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: premium
provisioner: kubernetes.io/azure-disk
parameters:
storageaccounttype: Premium_LRS
kind: Managed
EOF
To determine which Storage Classes have been provisioned:
$ kubectl get storageclasses
NAME PROVISIONER AGE
premium kubernetes.io/azure-disk 1m
standard kubernetes.io/azure-disk 1m
After you create a Storage Class, you can use Kubernetes Objects to dynamically provision Azure Disks. This is done using Kubernetes Persistent Volumes Claims.
The following example uses the standard storage class and creates a 5 GiB Azure Disk. Alter these values to fit your use case.
$ cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: azure-disk-pvc
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
EOF
At this point, you should see a new Persistent Volume Claim and Persistent Volume inside of Kubernetes. You should also see a new Azure Disk created in the Azure Portal.
$ kubectl get persistentvolumeclaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
azure-disk-pvc Bound pvc-587deeb6-6ad6-11e9-9509-0242ac11000b 5Gi RWO standard 1m
$ kubectl get persistentvolume
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-587deeb6-6ad6-11e9-9509-0242ac11000b 5Gi RWO Delete Bound default/azure-disk-pvc standard 3m
Now that a Kubernetes Persistent Volume has been created, you can mount this into a Kubernetes Pod. The disk can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example just mounts the persistent volume into a standalone pod.
$ cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
name: mypod-dynamic-azuredisk
spec:
containers:
- name: mypod
image: nginx
ports:
- containerPort: 80
name: "http-server"
volumeMounts:
- mountPath: "/usr/share/nginx/html"
name: storage
volumes:
- name: storage
persistentVolumeClaim:
claimName: azure-disk-pvc
EOF
In Azure, there are limits to the number of data disks that can be attached to each Virtual Machine. This data is shown in Azure Virtual Machine Sizes. Kubernetes is aware of these restrictions, and prevents pods from deploying on Nodes that have reached their maximum Azure Disk Capacity.
This can be seen if a pod is stuck in the ContainerCreating
stage:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mypod-azure-disk 0/1 ContainerCreating 0 4m
Describing the pod displays troubleshooting logs, showing the node has reached its capacity:
$ kubectl describe pods mypod-azure-disk
<...>
Warning FailedAttachVolume 7s (x11 over 6m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" : Attach volume "kubernetes-dynamic-pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" to instance "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/worker-03" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=409 -- Original Error: failed request: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="The maximum number of data disks allowed to be attached to a VM of this size is 4." Target="dataDisks"
Platform operators can provide persistent storage for workloads running on Docker Enterprise and Microsoft Azure by using Azure Files. You can either pre-provision Azure Files Shares to be consumed by Kubernetes Pods or can you use the Azure Kubernetes integration to dynamically provision Azure Files Shares on demand.
This guide assumes you have already provisioned a UCP environment on Microsoft Azure. The cluster must be provisioned after meeting all prerequisites listed in Install UCP on Azure.
Additionally, this guide uses the Kubernetes Command Line tool
$ kubectl
to provision Kubernetes objects within a UCP cluster.
Therefore, you must download this tool along with a UCP client bundle.
You can use existing Azure Files Shares or manually provision new ones to provide persistent storage for Kubernetes Pods. Azure Files Shares can be manually provisioned in the Azure Portal using ARM Templates or using the Azure CLI. The following example uses the Azure CLI to manually provision Azure Files Shares.
When manually creating an Azure Files Share, first create an Azure Storage Account for the file shares. If you have already provisioned a Storage Account, you can skip to “Creating an Azure Files Share.”
Note
The Azure Kubernetes Driver does not support Azure Storage Accounts created using Azure Premium Storage.
$ REGION=ukwest
$ SA=mystorageaccount
$ RG=myresourcegroup
$ az storage account create \
--name $SA \
--resource-group $RG \
--location $REGION \
--sku Standard_LRS
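The next step assumes an Azure Files share already exists in this storage account. As a sketch (the quota value is an assumption; the share name matches the FS variable used below), one can be created with the Azure CLI:
$ FS=myfileshare
# Depending on how you are authenticated, --account-key may also be required.
$ az storage share create \
    --name $FS \
    --account-name $SA \
    --quota 5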
After a File Share has been created, you must load the Azure Storage Account Access key as a Kubernetes Secret into UCP. This provides access to the file share when Kubernetes attempts to mount the share into a pod. This key can be found in the Azure Portal or retrieved as shown in the following example by the Azure CLI.
$ SA=mystorageaccount
$ RG=myresourcegroup
$ FS=myfileshare
# The Azure Storage Account Access Key can also be found in the Azure Portal
$ STORAGE_KEY=$(az storage account keys list --resource-group $RG --account-name $SA --query "[0].value" -o tsv)
$ kubectl create secret generic azure-secret \
--from-literal=azurestorageaccountname=$SA \
--from-literal=azurestorageaccountkey=$STORAGE_KEY
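The guide does not show the final mounting step here; as an illustrative sketch (the pod name and mount path are assumptions), the secret and share can be consumed through an azureFile volume:
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: mypod-azurefile
spec:
  containers:
  - image: nginx
    name: mypod
    volumeMounts:
    - name: mystorage
      mountPath: /data
  volumes:
  - name: mystorage
    azureFile:
      secretName: azure-secret
      shareName: myfileshare
      readOnly: false
EOF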
AWS Elastic Block Store (EBS) can be used with Kubernetes in Docker Enterprise 2.1 to provide AWS volumes as persistent storage for applications. Before using EBS volumes, configure UCP and the AWS infrastructure so that storage orchestration can function.
Kubernetes cloud providers provide a method of provisioning cloud resources through Kubernetes via the --cloud-provider option. In AWS, this flag allows the provisioning of EBS volumes and cloud load balancers.
Configuring a cluster for AWS requires several specific configuration parameters in the infrastructure before installing UCP.
Instances must have the following AWS Identity and Access permissions configured to provision EBS volumes through Kubernetes PVCs.
Master | Worker |
---|---|
ec2:DescribeInstances | ec2:DescribeInstances |
ec2:AttachVolume | ec2:AttachVolume |
ec2:DetachVolume | ec2:DetachVolume |
ec2:DescribeVolumes | ec2:DescribeVolumes |
ec2:CreateVolume | ec2:DescribeSecurityGroups |
ec2:DeleteVolume | |
ec2:CreateTags | |
ec2:DescribeSecurityGroups |
- Apply a tag with the key KubernetesCluster to each instance and assign the same value across all nodes, for example, UCPKubernetesCluster.
- The --cloud-provider=aws flag is required at install time.
- If the cloud provider is instead set through the UCP configuration file, the ucp-agent service needs to be updated to propagate the new configuration:

[cluster_config]
...
cloud_provider = "aws"
After configuring UCP for the AWS cloud provider, you can create persistent volumes that are backed by EBS volumes, attached to hosts, and mounted inside pods. The EBS volumes are provisioned dynamically, so they are created, attached, and destroyed along with the lifecycle of the persistent volumes. Users do not need direct access to AWS, because these resources are requested through Kubernetes primitives.
We recommend you use the StorageClass
and PersistentVolumeClaim
resources as these abstraction layers provide more portability as well
as control over the storage layer across environments.
A StorageClass lets administrators describe “classes” of available storage, where classes map to quality-of-service levels, backup policies, or any other policies required by cluster administrators. The following StorageClass maps a “standard” class of storage to the gp2 type of storage in AWS EBS.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
mountOptions:
- debug
A PersistentVolumeClaim
(PVC) is a claim for storage resources that
are bound to a PersistentVolume
(PV) when storage resources are
granted. The following PVC makes a request for 1Gi
of storage from
the standard
storage class.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: task-pv-claim
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
The following Pod spec references the PVC task-pv-claim
from above
which references the standard
storage class in this cluster.
kind: Pod
apiVersion: v1
metadata:
name: task-pv-pod
spec:
volumes:
- name: task-pv-storage
persistentVolumeClaim:
claimName: task-pv-claim
containers:
- name: task-pv-container
image: nginx
ports:
- containerPort: 80
name: "http-server"
volumeMounts:
- mountPath: "/usr/share/nginx/html"
name: task-pv-storage
Once the pod is deployed, run the following kubectl
command to
verify the PV was created and bound to the PVC.
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-751c006e-a00b-11e8-8007-0242ac110012 1Gi RWO Retain Bound default/task-pv-claim standard 3h
The AWS console shows that a volume with a matching name, of type gp2 and 1 GiB in size, has been provisioned.
This image has commands to install and manage UCP on a Docker Engine.
You can configure the commands using flags or environment variables.
When using environment variables, use the docker container
run -e VARIABLE_NAME
syntax to pass the value from your shell, or
docker container run -e VARIABLE_NAME=value
to specify the value
explicitly on the command line.
The container running this image needs to be named ucp and bind-mount the Docker daemon socket. Below you can find an example of how to run this image.
Additional help is available for each command with the --help
flag.
docker container run -it --rm \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
command [command arguments]
Use this command to create a backup of a UCP manager node.
docker container run \
--rm \
--interactive \
--name ucp \
--log-driver none \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
backup [command options] > backup.tar
This command creates a tar file with the contents of the volumes used
by this UCP manager node, and prints it. You can then use the restore
command to restore the data from an existing backup.
To create backups of a multi-node cluster, you only need to back up a single manager node. The restore operation will reconstitute a new UCP installation from the backup of any previous manager.
Note
The backup contains private keys and other sensitive information. Use
the --passphrase
flag to encrypt the backup with PGP-compatible
encryption or --no-passphrase
to opt out (not recommended).
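For example, a sketch of an encrypted backup using the documented --passphrase flag (the passphrase value is a placeholder):
docker container run \
  --rm \
  --interactive \
  --name ucp \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp \
  backup --passphrase "my secret passphrase" > backup.tar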
If using the --file option, the path to the file must be bind mounted onto the container that is performing the backup, and the filepath must be relative to the container’s file tree. For example:
docker run <other options> --mount type=bind,src=/home/user/backup:/backup docker/ucp backup --file /backup/backup.tar
If you are installing UCP on a manager node with SELinux enabled at the daemon and operating system level, you will need to pass --security-opt label=disable in your install command. This flag will disable SELinux policies on the installation container. The UCP installation container mounts and configures the Docker socket as part of the UCP installation, therefore the UCP installation will fail with a permission denied error if you fail to pass in this flag.
FATA[0000] unable to get valid Docker client: unable to ping Docker
daemon: Got permission denied while trying to connect to the Docker
daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/_ping:
dial unix /var/run/docker.sock: connect: permission denied -
If SELinux is enabled on the Docker daemon, make sure you run
UCP with "docker run --security-opt label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."
A backup command for a system with SELinux enabled at the daemon level would be:
docker container run \
--rm \
--interactive \
--name ucp \
--security-opt label=disable \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
backup [command options] > backup.tar
Option | Description |
---|---|
--debug, -D | Enable debug mode. |
--file value | Name of the file to write the backup contents to. Ignored in interactive mode. |
--jsonlog | Produce json formatted output for easier parsing. |
--include-logs | Only relevant if --file is also included. If true, an encrypted backup.log file will be stored alongside the backup.tar in the mounted directory. Default is true. |
--interactive, -i | Run in interactive mode and prompt for configuration values. |
--no-passphrase | Opt out of encrypting the tar file with a passphrase (not recommended). |
--passphrase value | Encrypt the tar file with a passphrase. |
Use this command to print the public certificates used by this UCP web server.
This command outputs the public certificates for the UCP web server running on this node. By default, it prints the contents of the ca.pem and cert.pem files.
When integrating UCP and DTR, use this command with the --cluster --ca
flags to configure DTR.
docker container run --rm \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
dump-certs [command options]
Option | Description |
---|---|
--debug, -D | Enable debug mode. |
--jsonlog | Produce json formatted output for easier parsing. |
--ca | Only print the contents of the ca.pem file. |
--cluster | Print the internal UCP swarm root CA and cert instead of the public server cert. |
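For example, a sketch of capturing the cluster root CA for the DTR integration mentioned above (the output filename is an assumption):
docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp \
  dump-certs --cluster --ca > ucp-cluster-ca.pem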
Use this command to display an example configuration file for UCP.
docker container run --rm -i \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
example-config
Use this command to print the ID of the UCP components running on this node.
This ID matches what you see when running the docker info
command while
using a client bundle. This ID is used by other commands as confirmation.
docker container run --rm \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
id
Option | Description |
---|---|
--debug, -D | Enable debug mode. |
--jsonlog | Produce json formatted output for easier parsing. |
Use this command to verify the UCP images on this node. This command checks the UCP images that are available in this node, and pulls the ones that are missing.
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
images [command options]
Option | Description |
---|---|
--debug, -D | Enable debug mode. |
--jsonlog | Produce json formatted output for easier parsing. |
--list | List all images used by UCP but don’t pull them. |
--pull value | Pull UCP images: always, when missing, or never. |
--registry-password value | Password to use when pulling images. |
--registry-username value | Username to use when pulling images. |
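For example (a usage sketch based on the flags above), to list the images UCP requires without pulling them:
docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp \
  images --list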
Use this command to install UCP on a node. Running this command will initialize a new swarm, turn a node into a manager, and install UCP.
When installing UCP, you can customize:
The UCP web server certificates. Create a volume named
ucp-controller-server-certs
and copy the ca.pem
, cert.pem
, and
key.pem
files to the root directory. Next, run the install command with
the --external-server-cert
flag.
The license used by UCP, which you can accomplish by bind-mounting the
file at /config/docker_subscription.lic
in the tool or by specifying
the --license $(cat license.lic)
option.
For example, to bind-mount the file:
-v /path/to/my/config/docker_subscription.lic:/config/docker_subscription.lic
If you’re joining more nodes to this swarm, open the following ports in your firewall:
- the port configured with --controller-port
- the port configured with --swarm-port
If you are installing UCP on a manager node with SELinux enabled at the daemon and OS level, you need to pass --security-opt label=disable to your install command. This flag disables SELinux policies on the installation container. The UCP installation container mounts and configures the Docker socket as part of the installation, so without this flag the installation fails with the following permission denied error:
FATA[0000] unable to get valid Docker client: unable to ping Docker daemon: Got
permission denied while trying to connect to the Docker daemon socket at
unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/_ping: dial
unix /var/run/docker.sock: connect: permission denied - If SELinux is enabled
on the Docker daemon, make sure you run UCP with "docker run --security-opt
label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."
An installation command for a system with SELinux enabled at the daemon level would be:
docker container run \
--rm \
--interactive \
--tty \
--name ucp \
--security-opt label=disable \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
install [command options]
If you are installing on a public cloud platform, there is cloud specific UCP installation documentation:
Option | Description |
---|---|
--debug, -D | Enable debug mode |
--jsonlog | Produce json formatted output for easier parsing. |
--interactive, -i | Run in interactive mode and prompt for configuration values. |
--admin-password value | The UCP administrator password [$UCP_ADMIN_PASSWORD]. |
--admin-username value | The UCP administrator username [$UCP_ADMIN_USER]. |
--azure-ip-count value | Configure the number of IP addresses to be provisioned for each Azure virtual machine (default: "128"). |
--binpack | Set the Docker Swarm scheduler to binpack mode. Used for backwards compatibility. |
--cloud-provider value | The cloud provider for the cluster. |
--cni-installer-url value | A URL pointing to a Kubernetes YAML file to be used as an installer for the CNI plugin of the cluster. If specified, the default CNI plugin will not be installed. If the URL uses the HTTPS scheme, no certificate verification will be performed. |
--controller-port value | Port for the web UI and API (default: 443). |
--data-path-addr value | Address or interface to use for data path traffic. Format: IP address or network interface name [$UCP_DATA_PATH_ADDR]. |
--disable-tracking | Disable anonymous tracking and analytics. |
--disable-usage | Disable anonymous usage reporting. |
--dns-opt value | Set DNS options for the UCP containers [$DNS_OPT]. |
--dns-search value | Set custom DNS search domains for the UCP containers [$DNS_SEARCH]. |
--dns value | Set custom DNS servers for the UCP containers [$DNS]. |
--enable-profiling | Enable performance profiling. |
--existing-config | Use the latest existing UCP config during this installation. The install will fail if a config is not found. |
--external-server-cert | Customize the certificates used by the UCP web server. |
--external-service-lb value | Set the IP address of the load balancer that published services are expected to be reachable on. |
--force-insecure-tcp | Force install to continue even with unauthenticated Docker Engine ports. |
--force-minimums | Force the install/upgrade even if the system does not meet the minimum requirements. |
--host-address value | The network address to advertise to other nodes. Format: IP address or network interface name [$UCP_HOST_ADDRESS]. |
--iscsiadm-path value | Path to the host iscsiadm binary. This option is applicable only when --storage-iscsi is specified. |
--kube-apiserver-port value | Port for the Kubernetes API server (default: 6443). |
--kv-snapshot-count value | Number of changes between key-value store snapshots (default: 20000) [$KV_SNAPSHOT_COUNT]. |
--kv-timeout value | Timeout in milliseconds for the key-value store (default: 5000) [$KV_TIMEOUT]. |
--license value | Add a license, e.g. --license "$(cat license.lic)" [$UCP_LICENSE]. |
--nodeport-range value | Allowed port range for Kubernetes services of type NodePort (default: "32768-35535"). |
--pod-cidr value | Kubernetes cluster IP pool for pods to allocate IPs from (default: "192.168.0.0/16"). |
--preserve-certs | Don’t generate certificates if they already exist. |
--pull value | Pull UCP images: always, when missing, or never (default: "missing"). |
--random | Set the Docker Swarm scheduler to random mode. Used for backwards compatibility. |
--registry-password value | Password to use when pulling images [$REGISTRY_PASSWORD]. |
--registry-username value | Username to use when pulling images [$REGISTRY_USERNAME]. |
--san value | Add subject alternative names to certificates (e.g. --san www1.acme.com --san www2.acme.com) [$UCP_HOSTNAMES]. |
--service-cluster-ip-range value | Kubernetes cluster IP range for services (default: "10.96.0.0/16"). |
--skip-cloud-provider-check | Disables checks which rely on detecting which (if any) cloud provider the cluster is currently running on. |
--storage-expt-enabled | Enable experimental features in Kubernetes storage. |
--storage-iscsi | Enable iSCSI based persistent volumes in Kubernetes. |
--swarm-experimental | Enable Docker Swarm experimental features. Used for backwards compatibility. |
--swarm-grpc-port value | Port for communication between nodes (default: 2377). |
--swarm-port value | Port for the Docker Swarm manager. Used for backwards compatibility (default: 2376). |
--unlock-key value | The unlock key for this swarm-mode cluster, if one exists [$UNLOCK_KEY]. |
--unmanaged-cni | Flag to indicate whether the CNI provider is Calico and managed by UCP (Calico is the default CNI provider). |
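Putting some of these options together, a non-interactive installation might look like the following sketch; the host address, administrator credentials, and SAN value are placeholders you would replace with your own:
docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
install \
--host-address <node-ip-address> \
--admin-username admin \
--admin-password <password> \
--san ucp.example.com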
Use this command to check the suitability of the node for a UCP installation.
docker run --rm -it \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
port-check-server [command options]
Option | Description |
---|---|
--listen-address, -l value | Listen address (default: ":2376") |
Use this command to restore a UCP cluster from a backup.
This command installs a new UCP cluster that is populated with the
state of a previous UCP manager node using a tar file generated by
the backup
command. All UCP settings, users, teams and permissions
will be restored from the backup file.
The restore operation does not alter or recover any containers, networks, volumes, or services of the underlying cluster.

The restore command can be performed on any manager node of an existing cluster. If the current node does not belong to a cluster, one will be initialized using the value of the --host-address flag. When restoring on an existing swarm-mode cluster, no UCP components may be running on any node of the cluster. This cleanup can be performed with the uninstall-ucp command.

If the restore is performed on a different cluster than the one where the backup file was taken, the Cluster Root CA of the old UCP installation will not be restored. This invalidates any previously issued Admin Client Bundles, and all administrators will be required to download new client bundles after the operation completes. Any existing Client Bundles for non-admin users will remain fully operational.
By default, the backup tar file is read from stdin. You can also
bind-mount the backup file under /config/backup.tar
, and run the
restore command with the --interactive
flag.
Option | Description |
---|---|
--debug, -D | Enable debug mode |
--jsonlog | Produce json formatted output for easier parsing. |
--interactive, -i | Run in interactive mode and prompt for configuration values. |
--data-path-addr value | Address or interface to use for data path traffic. |
--force-minimums | Force the install/upgrade even if the system does not meet the minimum requirements. |
--host-address value | The network address to advertise to other nodes. Format: IP address or network interface name. |
--passphrase value | Decrypt the backup tar file with the provided passphrase. |
--san value | Add subject alternative names to certificates (e.g. --san www1.acme.com --san www2.acme.com). |
--swarm-grpc-port value | Port for communication between nodes (default: 2377). |
--unlock-key value | The unlock key for this swarm-mode cluster, if one exists. |
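For example, to restore from an encrypted backup read from stdin, a minimal sketch (the passphrase and file path are placeholders):
docker container run --rm -i \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
restore --passphrase "secret12chars" < /tmp/ucp-backup.tar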
Use this command to create a support dump for specified UCP nodes.
This command creates a support dump file for the specified node(s),
and prints it to stdout. This includes the ID of the UCP components
running on the node. The ID matches what you see when running the
docker info
command while using a client bundle, and is used by
other commands as confirmation.
docker container run --rm \
--name ucp \
--log-driver none \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
support [command options] > docker-support.tgz
Option | Description |
---|---|
--debug, -D | Enable debug mode |
--jsonlog | Produce json formatted output for easier parsing. |
Use this command to uninstall UCP from this swarm, but preserve the swarm so that your applications can continue running.
After UCP is uninstalled, you can use the docker swarm leave
and docker node rm
commands to remove nodes from the swarm.
Once UCP is uninstalled, you will not be able to join nodes to the swarm unless UCP is installed again.
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
uninstall-ucp [command options]
Option | Description |
---|---|
--debug, -D | Enable debug mode |
--jsonlog | Produce json formatted output for easier parsing. |
--interactive, -i | Run in interactive mode and prompt for configuration values. |
--id value | The ID of the UCP instance to uninstall. |
--pull value | Pull UCP images: always, when missing, or never. |
--purge-config | Remove UCP configs during uninstallation. |
--registry-password value | Password to use when pulling images. |
--registry-username value | Username to use when pulling images. |
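For example, to remove a specific UCP instance together with its configuration, using the ID reported by the id command:
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
uninstall-ucp --id <ucp-instance-id> --purge-config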
Use this command to upgrade the UCP cluster.
Before performing an upgrade, you should perform a backup
using the backup
command.
After upgrading UCP, browse to the UCP web UI and confirm that each node is healthy and that all nodes have been upgraded successfully.
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
upgrade [command options]
Option | Description |
---|---|
--debug, -D | Enable debug mode |
--jsonlog | Produce json formatted output for easier parsing. |
--interactive, -i | Run in interactive mode and prompt for configuration values. |
--admin-password value | The UCP administrator password. |
--admin-username value | The UCP administrator username. |
--force-minimums | Force the install/upgrade even if the system does not meet the minimum requirements. |
--host-address value | Override the previously configured host address with this IP or network interface. |
--id | The ID of the UCP instance to upgrade. |
--manual-worker-upgrade | Whether to manually upgrade worker nodes. Defaults to false. |
--pull | Pull UCP images: always, when missing, or never. |
--registry-password value | Password to use when pulling images. |
--registry-username value | Username to use when pulling images. |
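For example, to upgrade a specific UCP instance while keeping manual control over when worker nodes are upgraded, a sketch using the options above:
docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
docker/ucp \
upgrade --id <ucp-instance-id> --manual-worker-upgrade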
Docker Trusted Registry (DTR) is the enterprise-grade image storage solution from Docker. You install it behind your firewall so that you can securely store and manage the Docker images you use in your applications.
Image and job management
DTR can be installed on-premises, or on a virtual private cloud. And with it, you can store your Docker images securely, behind your firewall.
You can use DTR as part of your continuous integration, and continuous delivery processes to build, ship, and run your applications.
DTR has a web user interface that allows authorized users in your organization to browse Docker images and review repository events. It even allows you to see what Dockerfile lines were used to produce the image and, if security scanning is enabled, to see a list of all of the software installed in your images. Additionally, you can now review and audit jobs on the web interface.
Availability
DTR is highly available through the use of multiple replicas of all containers and metadata such that if a machine fails, DTR continues to operate and can be repaired.
Efficiency
DTR has the ability to cache images closer to users to reduce the amount of bandwidth used when pulling Docker images.
DTR has the ability to clean up unreferenced manifests and layers.
Built-in access control
DTR uses the same authentication mechanism as Docker Universal Control Plane. Users can be managed manually or synchronized from LDAP or Active Directory. DTR uses Role Based Access Control (RBAC) to allow you to implement fine-grained access control policies for your Docker images.
Security scanning
DTR has a built-in security scanner that can be used to discover what versions of software are used in your images. It scans each layer and aggregates the results to give you a complete picture of what you are shipping as a part of your stack. Most importantly, it correlates this information with a vulnerability database that is kept up to date through periodic updates. This gives you unprecedented insight into your exposure to known security threats.
Image signing
DTR ships with Notary built in so that you can use Docker Content Trust to sign and verify images. For more information about managing Notary data in DTR see the DTR-specific notary documentation.
Learn about new features, bug fixes, breaking changes, and known issues for each DTR version.
(2020-11-12)
(2020-08-10)
Starting with this release, we moved the location of our offline bundles for DTR from https://packages.docker.com/caas/ to https://packages.mirantis.com/caas/ for the following versions.
Offline bundles for other previous versions of DTR will remain on the docker domain.
Due to infrastructure changes, licenses will no longer auto-update, and the related screens in DTR have been removed.
(2020-01-28)
- unable to cancel request: nil. (docker/dhe-deploy #10805)
- Includes a new version of the security scanner which re-enables daily CVE database updates. Following the patch release upgrade, security scans will fail until a new version of the database is provided (if DTR is configured for online updates, this will occur automatically within 24 hours). To trigger an immediate update, (1) access the DTR UI, (2) go to Security under System settings, and (3) click the Sync database now button. (docker/dhe-deploy #10847)
- If DTR is configured for offline updates, download the CVE Vulnerability Database for DTR version 2.6.12 or higher.
(2019-11-13)
- 1.12.12. (docker/dhe-deploy #10769)

(2019-10-08)

(2019-09-03)
- 1.12.9. (docker/dhe-deploy #10557)

(2019-7-17)
- Fixed an issue where the dockersearch API returned incorrect results when the search query ended in a digit. (docker/dhe-deploy #10434)
- 1.12.7. (docker/dhe-deploy #10460)
- 3.9.4. (docker/dhe-deploy #10460)

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-6-27)
There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-5-6)
- --storage-migrated option when performing an NFS reconfiguration via docker run docker/dtr reconfigure --nfs-url .... (docker/dhe-deploy #10246)

1. curl ... /api/v0/admin/settings/registry > storage.json
2. Add keep_metadata: true as a top-level key in the JSON you just created and modify it to contain your new storage settings.
3. curl -X PUT .../api/v0/admin/settings/registry -d @storage.json

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-4-11)
- Users tab from the side navigation. (#10222)

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.
(2019-3-28)
- Added the --storage-migrated option to reconfigure with migrated content when moving content to a new NFS URL. (ENGDTR-794)
- docker pull would fail on images that have been pushed to the repository after you upgrade to 2.5 and opt into garbage collection. This also applied when upgrading from 2.5 to 2.6. The issue has been fixed in DTR 2.6.4. (ENGDTR-330 and docker/dhe-deploy #10105)

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-2-28)
There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- Reconfiguring DTR with --nfs-storage-url will assume you are switching to a fresh storage backend and will wipe your existing tags (ENGDTR-794). See Reconfigure Using a Local NFS Volume and Restore to a Local NFS Volume for Docker’s recommended recovery strategies.
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-1-29)

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- Reconfiguring DTR with --nfs-storage-url will assume you are switching to a fresh storage backend and will wipe your existing tags (ENGDTR-794). See Reconfigure Using a Local NFS Volume and Restore to a Local NFS Volume for Docker’s recommended recovery strategies.
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2019-01-09)

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- Reconfiguring DTR with --nfs-storage-url will assume you are switching to a fresh storage backend and will wipe your existing tags (ENGDTR-794). See Reconfigure Using a Local NFS Volume and Restore to a Local NFS Volume for Docker’s recommended recovery strategies.
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

(2018-11-08)
Web Interface

CLI
- Added --async-nfs and --nfs-options when installing or reconfiguring NFS for external storage. See docker/dtr install and docker/dtr reconfigure for more details.
- When restoring, specify --dtr-use-default-storage, --dtr-storage-volume, or --nfs-storage-url. This ensures recovery of the storage setting configured when the backup was created. See docker/dtr restore for more details.

API
- GET /api/v0/imagescan/scansummary/repositories/{namespace}/{reponame}/{tag}/export endpoint. Specify text/csv as an Accept request HTTP header.
- New pruningPolicies endpoints:
  - GET /api/v0/repositories/{namespace}/{reponame}/pruningPolicies
  - POST /api/v0/repositories/{namespace}/{reponame}/pruningPolicies
  - GET /api/v0/repositories/{namespace}/{reponame}/pruningPolicies/test
  - GET /api/v0/repositories/{namespace}/{reponame}/pruningPolicies/{pruningpolicyid}
  - PUT /api/v0/repositories/{namespace}/{reponame}/pruningPolicies/{pruningpolicyid}
  - DELETE /api/v0/repositories/{namespace}/{reponame}/pruningPolicies/{pruningpolicyid}

See Docker Trusted Registry API for endpoint details and example usage. Alternatively, you can log in to the DTR web interface and select API from the bottom left navigation pane.

There are important changes to the upgrade process that impact any upgrades coming from any version before 18.09 to version 18.09 or greater. For DTR-specific changes, see the 2.5 to 2.6 upgrade documentation.

Known issues:
- docker/imagefs is currently broken. (docker/dhe-deploy #9490)
- Reconfiguring DTR with --nfs-storage-url will assume you are switching to a fresh storage backend and will wipe your existing tags (ENGDTR-794). See Reconfigure Using a Local NFS Volume and Restore to a Local NFS Volume for Docker’s recommended recovery strategies.
- For Image promoted from repository events, a webhook notification is triggered twice during an image promotion when scanning is enabled on a repository. (docker/dhe-deploy #9685)
- When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This is necessary for online garbage collection. If the three system attempts fail, you will have to retrigger the metadatastoremigration job manually. Learn about manual metadata store migration.

Deprecations:
- GET /api/v0/imagescan/repositories/{namespace}/{reponame}/{tag} is deprecated in favor of GET /api/v0/imagescan/scansummary/repositories/{namespace}/{reponame}/{tag}.
- DELETE /api/v0/accounts/{namespace}/repositories
- DELETE /api/v0/repositories/{namespace}/{reponame}/manifests/{reference}
- The enableManifestLists field on the POST /api/v0/repositories/{namespace} endpoint will be removed in DTR 2.7. See Deprecation Notice for more details.

Docker Trusted Registry (DTR) is a containerized application that runs on a Docker Universal Control Plane cluster.
Once you have DTR deployed, you use your Docker CLI client to login, push, and pull images.
For high-availability you can deploy multiple DTR replicas, one on each UCP worker node.
All DTR replicas run the same set of services and changes to their configuration are automatically propagated to other replicas.
When you install DTR on a node, the following containers are started:
Name | Description |
---|---|
dtr-api-<replica_id> | Executes the DTR business logic. It serves the DTR web application and API |
dtr-garant-<replica_id> | Manages DTR authentication |
dtr-jobrunner-<replica_id> | Runs cleanup jobs in the background |
dtr-nginx-<replica_id> | Receives http and https requests and proxies them to other DTR components. By default it listens to ports 80 and 443 of the host |
dtr-notary-server-<replica_id> | Receives, validates, and serves content trust metadata, and is consulted when pushing or pulling to DTR with content trust enabled |
dtr-notary-signer-<replica_id> | Performs server-side timestamp and snapshot signing for content trust metadata |
dtr-registry-<replica_id> | Implements the functionality for pulling and pushing Docker images. It also handles how images are stored |
dtr-rethinkdb-<replica_id> | A database for persisting repository metadata |
dtr-scanningstore-<replica_id> | Stores security scanning data |
All these components are for internal use of DTR. Don’t use them in your applications.
To allow its containers to communicate, the following networks are created when installing DTR:

Name | Type | Description |
---|---|---|
dtr-ol | overlay | Allows DTR components running on different nodes to communicate, to replicate DTR data |
DTR uses these named volumes for persisting data:
Volume name | Description |
---|---|
dtr-ca-<replica_id> | Root key material for the DTR root CA that issues certificates |
dtr-notary-<replica_id> | Certificate and keys for the Notary components |
dtr-postgres-<replica_id> | Vulnerability scans data |
dtr-registry-<replica_id> | Docker images data, if DTR is configured to store images on the local filesystem |
dtr-rethink-<replica_id> | Repository metadata |
dtr-nfs-registry-<replica_id> | Docker images data, if DTR is configured to store images on NFS |
You can customize the volume driver used for these volumes, by creating the volumes before installing DTR. During the installation, DTR checks which volumes don’t exist in the node, and creates them using the default volume driver.
By default, the data for these volumes can be found at
/var/lib/docker/volumes/<volume-name>/_data
.
By default, Docker Trusted Registry stores images on the filesystem of the node where it is running, but you should configure it to use a centralized storage backend.
DTR supports these storage back ends:
DTR has a web UI where you can manage settings and user permissions.
You can push and pull images using the standard Docker CLI client or other tools that can interact with a Docker registry.
Docker Trusted Registry can be installed on-premises or on the cloud. Before installing, be sure your infrastructure has these requirements.
You can install DTR on-premises or on a cloud provider. To install DTR, all nodes must:
Note that Windows container images are typically larger than Linux ones and for this reason, you should consider provisioning more local storage for Windows nodes and for DTR setups that will store Windows container images.
When installing DTR on a node, make sure the following ports are open on that node:
Direction | Port | Purpose |
---|---|---|
in | 80/tcp | Web app and API client access to DTR. |
in | 443/tcp | Web app and API client access to DTR. |
These ports are configurable when installing DTR.
Docker Enterprise Edition is a software subscription that includes three products:
Learn more about the maintenance lifecycle for these products.
Docker Trusted Registry (DTR) is a containerized application that runs on a swarm managed by the Universal Control Plane (UCP). It can be installed on-premises or on a cloud infrastructure.
Before installing DTR, make sure your infrastructure meets the DTR system requirements that DTR needs to run.
Since DTR requires Docker Universal Control Plane (UCP) to run, you need to install UCP for production on all the nodes where you plan to install DTR.
DTR needs to be installed on a worker node that is being managed by UCP. You cannot install DTR on a standalone Docker Engine.
Once UCP is installed, navigate to the UCP web UI. In the Admin Settings, choose Docker Trusted Registry.
After you configure all the options, you’ll have a snippet that you can use to deploy DTR. It should look like this:
# Pull the latest version of DTR
$ docker pull docker/dtr:2.6.15
# Install DTR
$ docker run -it --rm \
docker/dtr:2.6.15 install \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
You can run that snippet on any node where Docker is installed. As an example you can SSH into a UCP node and run the DTR installer from there. By default the installer runs in interactive mode and prompts you for any additional information that is necessary.
By default DTR is deployed with self-signed certificates, so your UCP deployment might not be able to pull images from DTR. Use the optional --dtr-external-url <dtr-domain>:<port> flag when deploying DTR, so that UCP is automatically reconfigured to trust DTR. Since an HSTS (HTTP Strict-Transport-Security) header is included in all API responses, make sure to specify the FQDN (Fully Qualified Domain Name) of your DTR, or your browser may refuse to load the web interface.
In your browser, navigate to the Docker Universal Control Plane web interface, and navigate to Shared Resources > Stacks. DTR should be listed as an application.
You can also access the DTR web interface, to make sure it is working. In your browser, navigate to the address where you installed DTR.
After installing DTR, you should configure:
To perform these configurations, navigate to the Settings page of DTR.
Now that you have a working installation of DTR, you should test that you can push and pull images to it:
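A minimal sketch of such a test, assuming DTR is reachable at <dtr-url>, that you have already created a repository <namespace>/<repo>, and that a local image such as alpine:latest is available to tag:
docker login <dtr-url>
docker tag alpine:latest <dtr-url>/<namespace>/<repo>:latest
docker push <dtr-url>/<namespace>/<repo>:latest
docker pull <dtr-url>/<namespace>/<repo>:latest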
This step is optional.
To set up DTR for high availability, you can add more replicas to your DTR cluster. Adding more replicas allows you to load-balance requests across all replicas, and keep DTR working if a replica fails.
For high-availability you should set 3, 5, or 7 DTR replicas. The nodes where you’re going to install these replicas also need to be managed by UCP.
To add replicas to a DTR cluster, use the docker/dtr join command:
Load your UCP user bundle.
Run the join command.
When you join a replica to a DTR cluster, you need to specify the ID of a replica that is already part of the cluster. You can find an existing replica ID by going to the Shared Resources > Stacks page on UCP.
Then run:
docker run -it --rm \
docker/dtr:2.7.6 join \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
Caution

The <ucp-node-name> following the --ucp-node flag is the target node on which to install the DTR replica. This is NOT the UCP Manager URL.
Check that all replicas are running.
In your browser, navigate to the Docker Universal Control Plane web interface, and navigate to Shared Resources > Stacks. All replicas should be displayed.
The procedure to install Docker Trusted Registry on a host is the same, whether that host has access to the internet or not.
The only difference when installing on an offline host, is that instead of pulling the UCP images from Docker Hub, you use a computer that is connected to the internet to download a single package with all the images. Then you copy that package to the host where you’ll install DTR.
Use a computer with internet access to download a package with all DTR images:
$ wget <package-url> -O dtr.tar.gz
Now that you have the package in your local machine, you can transfer it to the machines where you want to install DTR.
For each machine where you want to install DTR:
Copy the DTR package to that machine.
$ scp dtr.tar.gz <user>@<host>
Use SSH to log in to the hosts where you transferred the package.
Load the DTR images.
Once the package is transferred to the hosts, you can use the
docker load
command to load the Docker images from the tar
archive:
$ docker load -i dtr.tar.gz
Now that the offline hosts have all the images needed to install DTR, you can install DTR on that host.
DTR makes outgoing connections to:
All of these uses of online connections are optional. You can choose to disable or not use any or all of these features on the admin settings page.
DTR uses semantic versioning and Docker aims to achieve specific guarantees while upgrading between versions. While downgrades are not supported, Docker supports upgrades according to the following rules:
Description | From | To | Supported |
---|---|---|---|
patch upgrade | x.y.0 | x.y.1 | yes |
skip patch version | x.y.0 | x.y.2 | yes |
patch downgrade | x.y.2 | x.y.1 | no |
minor upgrade | x.y.* | x.y+1.* | yes |
skip minor version | x.y.* | x.y+2.* | no |
minor downgrade | x.y.* | x.y-1.* | no |
skip major version | x.*.* | x+2.*.* | no |
major downgrade | x.*.* | x-1.*.* | no |
major upgrade | x.y.z | x+1.0.0 | yes |
major upgrade skipping minor version | x.y.z | x+1.y+1.z | no |
There may be at most a few seconds of interruption during the upgrade of a DTR cluster. Schedule the upgrade to take place outside of peak hours to avoid any business impacts.
There are [important changes to the upgrade process](/ee/upgrade) that, if not correctly followed, can impact the availability of applications running on the Swarm during upgrades. These constraints affect any upgrades coming from any version before 18.09 to version 18.09 or greater. Additionally, to ensure high availability during the DTR upgrade, you can drain the DTR replicas and move their workloads to updated workers. To do this, join new workers as DTR replicas to your existing cluster and then remove the old replicas. See docker/dtr join and docker/dtr remove for command options and details.
Before starting your upgrade, make sure that:
Make sure you are running DTR 2.5. If this is not the case, upgrade your installation to the 2.5 version.
Then pull the latest version of DTR:
docker pull docker/dtr:2.6.8
Make sure you have at least 16GB of available RAM on the node you are running the upgrade on. If the DTR node does not have access to the Internet, you can follow the Install DTR offline documentation to get the images.
Once you have the latest image on your machine (and the images on the target nodes if upgrading offline), run the upgrade command.
Note
The upgrade command can be run from any available node, as UCP is aware of which worker nodes have replicas.
docker run -it --rm \
docker/dtr:2.6.8 upgrade \
--ucp-insecure-tls
By default, the upgrade command runs in interactive mode and prompts you for any necessary information. You can also check the upgrade reference page for other existing flags.
The upgrade command will start replacing every container in your DTR cluster, one replica at a time. It will also perform certain data migrations. If anything fails or the upgrade is interrupted for any reason, you can rerun the upgrade command and it will resume from where it left off.
When upgrading from 2.5 to 2.6, the system will run a metadatastoremigration job after a successful upgrade. This involves migrating the blob links for your images, which is necessary for online garbage collection. With 2.6, you can log into the DTR web interface and navigate to System > Job Logs to check the status of the metadatastoremigration job. Refer to Audit Jobs via the Web Interface for more details.

Garbage collection is disabled while the migration is running. In the case of a failed metadatastoremigration, the system will retry twice.
If the three attempts fail, you will have to retrigger the metadatastoremigration job manually. To do so, send a POST request to the /api/v0/jobs endpoint:
curl https://<dtr-external-url>/api/v0/jobs -X POST \
-u username:accesstoken -H 'Content-Type':'application/json' -d \
'{"action": "metadatastoremigration"}'
Alternatively, select API from the bottom left navigation pane of the DTR web interface and use the Swagger UI to send your API request.
A patch upgrade changes only the DTR containers and is always safer than a minor version upgrade. The command is the same as for a minor upgrade.
If you have previously deployed a cache, make sure to upgrade the node dedicated for your cache to keep it in sync with your upstream DTR replicas. This prevents authentication errors and other strange behaviors.
After upgrading DTR, you need to redownload the vulnerability database. Learn how to update your vulnerability database.
Uninstalling DTR can be done by simply removing all data associated with each replica. To do that, you just run the destroy command once per replica:
docker run -it --rm \
docker/dtr:2.7.6 destroy \
--ucp-insecure-tls
You will be prompted for the UCP URL, UCP credentials, and which replica to destroy.
To see what options are available in the destroy command, check the destroy command reference documentation.
By default, you don’t need to license your Docker Trusted Registry. When installing DTR, it automatically starts using the same license file used on your Docker Universal Control Plane cluster.
However, there are some situations when you have to manually license your DTR installation:
Once you’ve downloaded the license file, you can apply it to your DTR installation. Navigate to the DTR web UI, and then go to the Settings page.
Click the Apply new license button, and upload your new license file.
By default the DTR services are exposed using HTTPS, to ensure all communications between clients and DTR are encrypted. Since DTR replicas use self-signed certificates for this, when a client accesses DTR, their browser won’t trust the certificate, so the browser displays a warning message.
You can configure DTR to use your own certificates, so that it is automatically trusted by your users’ browser and client tools.
To configure DTR to use your own certificates and keys, go to the DTR web UI, navigate to the Settings page, and scroll down to the Domain section.
Set the DTR domain name and upload the certificates and key:
Finally, click Save for the changes to take effect.
If you’re using certificates issued by a globally trusted certificate authority, any web browser or client tool should now trust DTR. If you’re using an internal certificate authority, you’ll need to configure your system to trust that certificate authority.
By default, users are shared between UCP and DTR, but you have to authenticate separately on the web UI of both applications.
You can configure DTR to have single sign-on (SSO) with UCP, so that users only have to authenticate once.
Note
After configuring single sign-on with DTR, users accessing
DTR via docker login
should create an access
token and use it to authenticate.
When installing DTR, use the docker/dtr install
--dtr-external-url <url>
option to enable SSO. When accessing the DTR web UI,
users are redirected to the UCP login page, and once they are authenticated,
they’re redirected to the URL you provided to --dtr-external-url
.
Use the domain name of DTR, or the domain name of a load balancer, if you’re using one, to load-balance requests across multiple DTR replicas.
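For example, a minimal sketch of an install that enables single sign-on, assuming DTR should be reachable at dtr.example.com (replace it with your own domain or load balancer address):
docker run -it --rm \
docker/dtr:2.7.6 install \
--ucp-node <ucp-node-name> \
--dtr-external-url dtr.example.com:443 \
--ucp-insecure-tls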
In your browser, navigate to the DTR web UI, and choose Settings. In the General tab, scroll to Domain & proxies.
Update the Load balancer / public address field to the url where users should be redirected once they are logged in. Use the domain name of DTR, or the domain name of a load balancer, if you’re using one, to load-balance requests across multiple DTR replicas.
Then enable Use single sign-on.
Once you save, users are redirected to UCP for logging in, and redirected back to DTR once they are authenticated.
Navigate to https://<dtr-url>
and log in with your credentials.
Select System from the left navigation pane, and scroll down to Domain & Proxies.
Update the Load balancer / Public Address field with the external URL where users should be redirected once they are logged in. Click Save to apply your changes.
Toggle Single Sign-on to automatically redirect users to UCP for logging in.
By default DTR uses the local filesystem of the node where it is running to store your Docker images. You can configure DTR to use an external storage backend, for improved performance or high availability.
If your DTR deployment has a single replica, you can continue using the local filesystem for storing your Docker images. If your DTR deployment has multiple replicas, make sure all replicas are using the same storage backend for high availability. Whenever a user pulls an image, the DTR node serving the request needs to have access to that image.
DTR supports the following storage systems:
Note
Some of the previous links are meant to be informative and are not representative of DTR’s implementation of these storage systems.
To configure the storage backend, log in to the DTR web interface as an admin, and navigate to System > Storage.
The storage configuration page gives you the most common configuration
options, but you have the option to upload a configuration file in
.yml
, .yaml
, or .txt
format.
By default, DTR creates a volume named dtr-registry-<replica-id>
to
store your images using the local filesystem. You can customize the name
and path of the volume by using
docker/dtr install --dtr-storage-volume
or
docker/dtr reconfigure --dtr-storage-volume
.
Warning
When running DTR 2.5 (with experimental online garbage collection) and 2.6.0 to 2.6.3, there is an issue with reconfiguring DTR with --nfs-storage-url which leads to erased tags. Make sure to back up your DTR metadata before you proceed.
To work around the --nfs-storage-url
flag issue, manually create a
storage volume on each DTR node. If DTR is already installed in your
cluster, reconfigure DTR
with the --dtr-storage-volume
flag using your newly-created volume.
If you’re deploying DTR with high-availability, you need to use NFS or any other centralized storage backend so that all your DTR replicas have access to the same images.
To check how much space your images are utilizing in the local filesystem, SSH into the DTR node and run:
# Find the path to the volume
docker volume inspect dtr-registry-<replica-id>

# Check the disk usage
sudo du -hs \
$(dirname $(docker volume inspect --format '{{.Mountpoint}}' dtr-registry-<replica-id>))
You can configure your DTR replicas to store images on an NFS partition, so that all replicas can share the same storage backend.
DTR supports Amazon S3 or other storage systems that are S3-compatible like Minio. Learn how to configure DTR with Amazon S3.
Starting in DTR 2.6, switching storage backends initializes a new
metadata store and erases your existing tags. This helps facilitate
online garbage collection, which has been introduced in 2.5 as an
experimental feature. In earlier versions, DTR would subsequently start
a tagmigration
job to rebuild tag metadata from the file layout in
the image layer store. This job has been discontinued for DTR 2.5.x
(with garbage collection) and DTR 2.6, as your storage backend could get
out of sync with your DTR metadata, like your manifests and existing
repositories. As best practice, DTR storage backends and metadata should
always be moved, backed up, and restored together.
In DTR 2.6.4, a new flag, --storage-migrated
, has been added to
docker/dtr reconfigure which lets you indicate
the migration status of your storage data during a reconfigure. If you are not
worried about losing your existing tags, you can skip the recommended steps
below and perform a reconfigure.
Docker recommends the following steps for your storage backend and metadata migration:
Disable garbage collection by selecting “Never” under System > Garbage Collection, so blobs referenced in the backup that you create continue to exist. See Garbage collection for more details. Make sure to keep it disabled while you’re performing the metadata backup and migrating your storage data.
Back up your existing metadata. See docker/dtr backup for CLI command description and options.
Migrate the contents of your current storage backend to the new one you are switching to. For example, upload your current storage data to your new NFS server.
Restore DTR from your backup and specify your new storage backend. See docker/dtr destroy and docker/dtr restore for CLI command descriptions and options.
With DTR restored from your backup and your storage data migrated to your new backend, garbage collect any dangling blobs using the following API request:
curl -u <username>:$TOKEN -X POST "https://<dtr-url>/api/v0/jobs" -H "accept: application/json" -H "content-type: application/json" -d "{ \"action\": \"onlinegc_blobs\" }"
On success, you should get a 202 Accepted
response with a job
id
and other related details. This ensures any blobs which are
not referenced in your previously created backup get destroyed.
If you have a long maintenance window, you can skip some steps from above and do the following:
Put DTR in “read-only” mode using the following API request:
curl -u <username>:$TOKEN -X POST "https://<dtr-url>/api/v0/meta/settings" -H "accept: application/json" -H "content-type: application/json" -d "{ \"readOnlyRegistry\": true }"
On success, you should get a 202 Accepted
response.
Migrate the contents of your current storage backend to the new one you are switching to. For example, upload your current storage data to your new NFS server.
Reconfigure DTR while specifying the
--storage-migrated
flag to preserve your existing tags.
Make sure to perform a backup before you change your storage backend when running DTR 2.5 (with online garbage collection) and 2.6.0-2.6.3. If you encounter an issue with lost tags, refer to the following resources:
Upgrade to DTR 2.6.4 and follow best practice for data migration to avoid the wiped tags issue when moving from one NFS server to another.
You can configure DTR to store Docker images on Amazon S3, or other file servers with an S3-compatible API like Cleversafe or Minio.
Amazon S3 and compatible services store files in “buckets”, and users have permissions to read, write, and delete files from those buckets. When you integrate DTR with Amazon S3, DTR sends all read and write operations to the S3 bucket so that the images are persisted there.
Before configuring DTR you need to create a bucket on Amazon S3. To get faster pulls and pushes, you should create the S3 bucket on a region that’s physically close to the servers where DTR is running.
Start by creating a bucket. Then, as a best practice you should create a new IAM user just for the DTR integration and apply an IAM policy that ensures the user has limited permissions.
This user only needs permissions to access the bucket that you’ll use to store images, and be able to read, write, and delete files.
Here’s an example of a user policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:ListAllMyBuckets",
"Resource": "arn:aws:s3:::*"
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads"
],
"Resource": "arn:aws:s3:::<bucket-name>"
},
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject",
"s3:ListBucketMultipartUploads"
],
"Resource": "arn:aws:s3:::<bucket-name>/*"
}
]
}
Once you’ve created a bucket and user, you can configure DTR to use it.
In your browser, navigate to https://<dtr-url>. Select System > Storage.

Select the S3 option, and fill in the information about the bucket and user.
Field | Description |
---|---|
Root directory | The path in the bucket where images are stored. |
AWS Region name | The region where the bucket is. Learn more |
S3 bucket name | The name of the bucket to store the images. |
AWS access key | The access key to use to access the S3 bucket. This can be left empty if you’re using an IAM policy. Learn more |
AWS secret key | The secret key to use to access the S3 bucket. This can be left empty if you’re using an IAM policy. |
Region endpoint | The endpoint name for the region you’re using. Learn more |
There are also some advanced settings.
Field | Description |
---|---|
Signature version 4 auth | Authenticate the requests using AWS signature version 4. Learn more |
Use HTTPS | Secure all requests with HTTPS, or make requests in an insecure way. |
Skip TLS verification | Encrypt all traffic, but don’t verify the TLS certificate used by the storage backend. |
Root CA certificate | The public key certificate of the root certificate authority that issued the storage backend certificate. |
Once you click Save, DTR validates the configurations and saves the changes.
If you’re using a TLS certificate in your storage backend that’s not globally trusted, you’ll have to configure all Docker Engines that push or pull from DTR to trust that certificate. When you push or pull an image DTR redirects the requests to the storage backend, so if clients don’t trust the TLS certificates of both DTR and the storage backend, they won’t be able to push or pull images. Learn how to configure the Docker client.
And if you’ve configured DTR to skip TLS verification, you also need to configure all Docker Engines that push or pull from DTR to skip TLS verification. You do this by adding DTR to the list of insecure registries when starting Docker.
DTR supports the following S3 regions:
Region |
---|
us-east-1 |
us-east-2 |
us-west-1 |
us-west-2 |
eu-west-1 |
eu-west-2 |
eu-central-1 |
ap-south-1 |
ap-southeast-1 |
ap-southeast-2 |
ap-northeast-1 |
ap-northeast-2 |
sa-east-1 |
cn-north-1 |
us-gov-west-1 |
ca-central-1 |
When running 2.5.x (with experimental garbage collection) or 2.6.0-2.6.4, there is an issue with changing your S3 settings on the web interface which leads to erased metadata. Make sure to back up your DTR metadata before you proceed.
To restore DTR using your previously configured S3
settings,
use docker/dtr restore
with --dtr-use-default-storage
to keep
your metadata.
You can configure DTR to store Docker images in an NFS directory. Starting in DTR 2.6, changing storage backends involves initializing a new metadatastore instead of reusing an existing volume. This helps facilitate online garbage collection. See changes to NFS reconfiguration below if you have previously configured DTR to use NFS.
Before installing or configuring DTR to use an NFS directory, make sure that:
To confirm that the hosts can connect to the NFS server, try to list the directories exported by your NFS server:
showmount -e <nfsserver>
You should also try to mount one of the exported directories:
mkdir /tmp/mydir && sudo mount -t nfs <nfs server>:<directory> /tmp/mydir
One way to configure DTR to use an NFS directory is at install time:
docker run -it --rm docker/dtr:2.7.5 install \
--nfs-storage-url <nfs-storage-url> \
<other options>
Use the format nfs://<nfs server>/<directory>
for the NFS storage
URL. To support NFS v4, you can now specify additional options when
running docker/dtr install with
--nfs-storage-url
.
When joining replicas to a DTR cluster, the replicas will pick up your storage configuration, so you will not need to specify it again.
To support NFS v4, more NFS options have been added to the CLI. See New Features for 2.6.0 - CLI for updates to docker/dtr reconfigure.
Warning

When running DTR 2.5 (with experimental online garbage collection) and 2.6.0 to 2.6.3, there is an issue with reconfiguring and restoring DTR with --nfs-storage-url which leads to erased tags. Make sure to back up your DTR metadata before you proceed.

To work around the --nfs-storage-url flag issue, manually create a storage volume. If DTR is already installed in your cluster, reconfigure DTR with the --dtr-storage-volume flag using your newly-created volume.
See Reconfigure Using a Local NFS Volume for Docker’s recommended recovery strategy.
In DTR 2.6.4, a new flag, --storage-migrated, has been added to docker/dtr reconfigure which lets you indicate the migration status of your storage data during a reconfigure. Upgrade to 2.6.4 and follow the best practice for data migration in 2.6.4 when switching storage backends.
The following shows you how to reconfigure DTR using an NFSv4 volume as a
storage backend:
docker run --rm -it \
docker/dtr:2.7.5 reconfigure \
--ucp-url <ucp_url> \
--ucp-username <ucp_username> \
--nfs-storage-url <nfs-storage-url> \
--async-nfs \
--storage-migrated
To reconfigure DTR to stop using NFS storage, leave the
--nfs-storage-url
option blank:
docker run -it --rm docker/dtr:2.7.5 reconfigure \
--nfs-storage-url ""
Docker Trusted Registry is designed to scale horizontally as your usage increases. You can add more replicas to make DTR scale to your demand and for high availability.
All DTR replicas run the same set of services and changes to their configuration are automatically propagated to other replicas.
To make DTR tolerant to failures, add additional replicas to the DTR cluster.
DTR replicas | Failures tolerated |
---|---|
1 | 0 |
3 | 1 |
5 | 2 |
7 | 3 |
When sizing your DTR installation for high-availability, follow these rules of thumb:
To have high-availability on UCP and DTR, you need a minimum of:
You also need to configure the DTR replicas to share the same object storage.
To add replicas to an existing DTR deployment:
Use ssh to log into any node that is already part of UCP.
Run the DTR join command:
docker run -it --rm \
docker/dtr:2.7.5 join \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
Where the --ucp-node
is the hostname of the UCP node where you
want to deploy the DTR replica. --ucp-insecure-tls
tells the
command to trust the certificates used by UCP.
If you have a load balancer, add this DTR replica to the load balancing pool.
To remove a DTR replica from your deployment:
Use ssh to log into any node that is part of UCP.
Run the DTR remove command:
docker run -it --rm \
docker/dtr:2.7.5 remove \
--ucp-insecure-tls
You will be prompted for:
If you’re load-balancing user requests across multiple DTR replicas, don’t forget to remove this replica from the load balancing pool.
Once you’ve joined multiple DTR replicas nodes for high-availability, you can configure your own load balancer to balance user requests across all replicas.
This allows users to access DTR using a centralized domain name. If a replica goes down, the load balancer can detect that and stop forwarding requests to it, so that the failure goes unnoticed by users.
DTR exposes several endpoints you can use to assess if a DTR replica is healthy or not:

- /_ping: an unauthenticated endpoint that checks if the DTR replica is healthy. This is useful for load balancing or other automated health check tasks.
- /nginx_status: returns the number of connections being handled by the NGINX front-end used by DTR.
- /api/v0/meta/cluster_status: returns extensive information about all DTR replicas.

DTR does not provide a load balancing service. You can use an on-premises or cloud-based load balancer to balance requests across multiple DTR replicas.
Important
Additional load balancer requirements for UCP
If you are also using UCP, there are additional requirements if you plan to load balance both UCP and DTR using the same load balancer.
You can use the unauthenticated /_ping
endpoint on each DTR replica,
to check if the replica is healthy and if it should remain in the load
balancing pool or not.
Also, make sure you configure your load balancer to handle the Host HTTP header correctly.
endpoint returns a JSON object for the replica being
queried of the form:
{
"Error": "error message",
"Healthy": true
}
A response of "Healthy": true
means the replica is suitable for
taking requests. It is also sufficient to check whether the HTTP status
code is 200.
An unhealthy replica will return 503 as the status code and populate
"Error"
with more details on any one of these services:
Note that this endpoint is for checking the health of a single replica. To get the health of every replica in a cluster, querying each replica individually is the preferred way to do it in real time.
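For example, a health probe can simply request the endpoint and treat an HTTP 200 status as healthy; the replica URL below is a placeholder, and the -k flag is only needed while the replica still uses self-signed certificates:
curl -ks -o /dev/null -w "%{http_code}\n" https://<dtr-replica-url>/_ping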
Use the following examples to configure your load balancer for DTR.
You can deploy your load balancer using:
This page explains how to set up and enable Docker Security Scanning on an existing installation of Docker Trusted Registry.
These instructions assume that you have already installed Docker Trusted Registry, and have access to an account on the DTR instance with administrator access.
Before you begin, make sure that you or your organization has purchased a DTR license that includes Docker Security Scanning, and that your Docker ID can access and download this license from the Docker Hub.
If you are using a license associated with an individual account, no
additional action is needed. If you are using a license associated with
an organization account, you may need to make sure your Docker ID is a
member of the Owners
team. Only Owners
team members can download
license files for an Organization.
If you will be allowing the Security Scanning database to update itself
automatically, make sure that the server hosting your DTR instance can
access https://dss-cve-updates.docker.com/
on the standard https
port 443.
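As a quick check from the server hosting DTR, a command along these lines confirms basic reachability (any HTTP status line in the output, even an error status, shows that the connection on port 443 works):
# Verify that the CVE update server is reachable from the server hosting DTR.
curl -sv https://dss-cve-updates.docker.com/ -o /dev/null 2>&1 | grep -E 'Connected to|HTTP/'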
If your DTR instance already has a license that includes Security Scanning, skip this step and proceed to enable DTR Security Scanning.
Tip
To check if your existing DTR license includes scanning, navigate to the DTR Settings page, and click Security. If an “Enable scanning” toggle appears, the license includes scanning.
If your current DTR license doesn’t include scanning, you must first download the new license.
Log in to the Docker Hub using a Docker ID with access to the license you need.
In the top right corner, click your user account icon, and select My Content.
Locate Docker Enterprise Edition in the content list, and click Setup.
Click License Key to download the license.
Next, install the new license on the DTR instance.
Log in to your DTR instance using an administrator account.
Click Settings in the left navigation.
On the General tab click Apply new license.
A file browser dialog appears.
Navigate to where you saved the license key (.lic
) file, select
it, and click Open.
Proceed to enable DTR Security Scanning.
To enable security scanning in DTR:
Log in to your DTR instance with an administrator account.
Click Settings in the left navigation.
Click the Security tab.
Note
If you see a message on this tab telling you to contact your Docker sales representative, then the license installed on this DTR instance does not include Docker Security Scanning. Check that you have purchased Security Scanning, and that the DTR instance is using the latest license file.
Click the Enable scanning toggle so that it turns blue and says “on”.
Next, provide a security database for the scanner. Security scanning will not function until DTR has a security database to use.
By default, security scanning is enabled in Online mode. In this
mode, DTR attempts to download a security database from a Docker
server. If your installation cannot access
https://dss-cve-updates.docker.com/
you must manually upload a
.tar
file containing the security database.
In Online mode, the DTR instance will contact a Docker server, download the latest vulnerability database, and install it. Scanning can begin once this process completes.
In Offline mode, use the instructions in Update CVE database - offline mode to upload an initial security database.
By default when Security Scanning is enabled, new repositories will automatically scan on docker push. If you had existing repositories before you enabled security scanning, you might want to change repository scanning behavior.
Two modes are available when Security Scanning is enabled:
Scan on push & Scan manually: the image is re-scanned on each docker push to the repository, and whenever a user with write access clicks the Start Scan links or Scan button.
Scan manually: the image is scanned only when a user with write access clicks the Start Scan links or Scan button.
By default, new repositories are set to Scan on push & Scan manually, but you can change this setting during repository creation.
Any repositories that existed before scanning was enabled are set to
Scan manually
mode by default. If these repositories are still in
use, you can change this setting from each repository’s Settings
page.
Note
To change an individual repository’s scanning mode, you
must have write
or admin
access to the repo.
To change an individual repository’s scanning mode:
Docker Security Scanning indexes the components in your DTR images and compares them against a known CVE database. When new vulnerabilities are reported, Docker Security Scanning matches the components in new CVE reports to the indexed components in your images, and quickly generates an updated report.
Users with administrator access to DTR can check when the CVE database was last updated from the Security tab in the DTR Settings pages.
By default Docker Security Scanning checks automatically for updates to the vulnerability database, and downloads them when available. If your installation does not have access to the public internet, use the Offline mode instructions below.
To ensure that DTR can access these updates, make sure that the host can
reach https://dss-cve-updates.docker.com/
on port 443 using https.
DTR checks for new CVE database updates at 3:00 AM UTC every day. If an update is found it is downloaded and applied without interrupting any scans in progress. Once the update is complete, the security scanning system looks for new vulnerabilities in the indexed components.
To set the update mode to Online:
Your choice is saved automatically.
Tip
DTR also checks for CVE database updates when scanning is first enabled, and when you switch update modes. If you need to check for a CVE database update immediately, you can briefly switch modes from online to offline and back again.
To update the CVE database for your DTR instance when it cannot contact
the update server, you download and install a .tar
file that
contains the database updates. To download the file:
Log in to Docker Hub.
If you are a member of an Organization managing licenses using Docker
Hub, make sure your account is a member of the Owners
team. Only
Owners can view and manage licenses and other entitlements for
Organizations from Docker Hub.
In the top right corner, click your user account icon, and select My Content.
If necessary, select an organization account from the Accounts menu at the upper right.
Locate your Docker EE Advanced subscription or trial.
Click the Setup button.
Click the Download CVE Vulnerability Database link to download the database file.
If you run into problems, contact us at nautilus-feedback@docker.com for the file.
To manually update the DTR CVE database from a .tar
file:
Navigate to the .tar file that you received, and click Open.
DTR installs the new CVE database, and begins checking already indexed images for components that match new or updated vulnerabilities.
Tip
The Upload button is unavailable while DTR applies CVE database updates.
To change the update mode:
Your choice is saved automatically.
The further away you are from the geographical location where DTR is deployed, the longer it will take to pull and push images. This happens because the files being transferred from DTR to your machine need to travel a longer distance, across multiple networks.
To decrease the time to pull an image, you can deploy DTR caches geographically closer to users.
Caches are transparent to users, since users still log in and pull images using the DTR URL address. DTR checks if users are authorized to pull the image, and redirects the request to the cache.
In this example, DTR is deployed on a datacenter in the United States, and a cache is deployed in the Asia office.
Users in the Asia office update their user profile within DTR to fetch from the cache in their office. They pull an image using:
# Log in to DTR
docker login dtr.example.org
# Pull image
docker image pull dtr.example.org/website/ui:3-stable
DTR authenticates the request and checks if the user has permission to pull the image they are requesting. If they have permission, they get an image manifest containing the list of image layers to pull, and are redirected to pull those layers from the Asia cache.
When users request those image layers from the Asia cache, the cache pulls them from DTR and keeps a copy that can be used to serve to other users without having to pull the image layers from DTR again.
Use caches if you:
If you need users to be able to push images faster, or you want to implement RBAC policies based on different regions, do not use caches. Instead, deploy multiple DTR clusters and implement mirroring policies between them.
With mirroring policies you can set up a development pipeline where images are automatically pushed between different DTR repositories, or across DTR deployments.
As an example you can set up a development pipeline with three different stages. Developers can push and pull images from the development environment, only pull from QA, and have no access to Production.
With multiple DTR deployments you can control the permissions developers have for each deployment, and you can create policies to automatically push images from one deployment to the next. Learn more about deployment policies.
The main reason to use a DTR cache is so that users can pull images from a service that’s geographically closer to them.
In this example a company has developers spread across three locations: United States, Asia, and Europe. Developers working in the US office can pull their images from DTR without problem, but developers in the Asia and Europe offices complain that it takes them a long time to pull images.
To address that, you can deploy DTR caches in the Asia and Europe offices, so that developers working from there can pull images much faster.
To deploy the DTR caches for this scenario, you need three datacenters:
Both caches are configured to fetch images from DTR.
Before deploying a DTR cache in a datacenter, make sure you:
If you only plan on running a DTR cache on this datacenter, you just need Docker EE Basic, which only includes the Docker Engine.
If you plan on running other workloads on this datacenter, consider deploying Docker EE Standard or Advanced. This way you can enforce fine-grained control over cluster resources, and it becomes easier to monitor and manage your applications.
You can customize the port used by the DTR cache, so you’ll have to configure your firewall rules to make sure users can access the cache using the port you chose.
By default the documentation guides you in deploying caches that are exposed on port 443/TCP using the swarm routing mesh.
This example guides you in deploying a DTR cache, assuming that you’ve got a DTR deployment up and running. It also assumes that you’ve provisioned multiple nodes and joined them into a swarm.
The DTR cache is going to be deployed as a Docker service, so that Docker automatically takes care of scheduling and restarting the service if something goes wrong.
We’ll manage the cache configuration using a Docker configuration, and the TLS certificates using Docker secrets. This allows you to manage the configurations securely and independently of the node where the cache is actually running.
To make sure the DTR cache is performant, it should be deployed on a node dedicated just for it. Start by labelling the node where you want to deploy the cache, so that you target the deployment to that node.
Use SSH to log in to a manager node of the swarm where you want to deploy the DTR cache. If you’re using UCP to manage that swarm, use a client bundle to configure your Docker CLI client to connect to the swarm.
docker node update --label-add dtr.cache=true <node-hostname>
Create a file structure that looks like this:
├── docker-stack.yml # Stack file to deploy cache with a single command
├── config.yml # The cache configuration file
└── certs
├── cache.cert.pem # The cache public key certificate
├── cache.key.pem # The cache private key
└── dtr.cert.pem # DTR CA certificate
Then add the following content to each of the files:
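As a rough sketch (the image tag, entrypoint, and mount paths are assumptions; config.yml follows the same registry configuration format as the Kubernetes config.yaml shown later in this guide), docker-stack.yml might look something like this:
version: "3.3"
services:
  cache:
    image: docker/dtr-content-cache:2.7.5     # assumed to match your DTR version
    entrypoint: ["/start.sh", "/config.yml"]  # assumption: the image's start script takes the config path
    ports:
      - 443:443
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.dtr.cache == true]   # target the node labelled earlier
      restart_policy:
        condition: on-failure
    configs:
      - config.yml             # short syntax mounts the config at /config.yml
    secrets:
      - dtr.cert.pem           # secrets are mounted under /run/secrets/
      - cache.cert.pem
      - cache.key.pem
configs:
  config.yml:
    file: ./config.yml
secrets:
  dtr.cert.pem:
    file: ./certs/dtr.cert.pem
  cache.cert.pem:
    file: ./certs/cache.cert.pem
  cache.key.pem:
    file: ./certs/cache.key.pem
If you use a layout like this, point the TLS certificate and key paths in config.yml at the locations where the secrets are mounted (for example /run/secrets/cache.cert.pem and /run/secrets/cache.key.pem).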
With this configuration, the cache fetches image layers from DTR and keeps a local copy for 24 hours. After that, if a user requests that image layer, the cache fetches it again from DTR.
The cache is configured to persist data inside its container. If something goes wrong with the cache service, Docker automatically redeploys a new container, but previously cached data is not persisted. You can customize the storage parameters, if you want to store the image layers using a persistent storage backend.
Also, the cache is configured to use port 443. If you’re already using that port in the swarm, update the deployment and configuration files to use another port. Don’t forget to create firewall rules for the port you choose.
Now that everything is set up, you can deploy the cache by running:
docker stack deploy --compose-file docker-stack.yml dtr-cache
You can check if the cache has been successfully deployed by running:
docker stack ps dtr-cache
Docker should show the dtr-cache stack is running.
Now that you’ve deployed a cache, you need to configure DTR to know
about it. This is done using the POST /api/v0/content_caches
API.
You can use the DTR interactive API documentation to use this API.
In the DTR web UI, click the top-right menu, and choose API docs.
Navigate to the POST /api/v0/content_caches
line and click it to
expand. In the body field include:
{
"name": "region-asia",
"host": "https://<cache-url>:<cache-port>"
}
Click the Try it out! button to make the API call.
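If you prefer the command line, a request along these lines (mirroring the curl conventions used elsewhere in this guide, with an admin account and the example values above) makes the same API call:
curl --user <admin-user>:<password> \
  --request POST "https://<dtr-url>/api/v0/content_caches" \
  --header "accept: application/json" \
  --header "content-type: application/json" \
  --data '{ "name": "region-asia", "host": "https://<cache-url>:<cache-port>" }'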
Now that you’ve registered the cache with DTR, users can configure their user profile to pull images from DTR or the cache.
In the DTR web UI, navigate to your Account, click the Settings tab, and change the Content Cache settings to use the cache you deployed.
If you need to set this for multiple users at the same time, use the
/api/v0/accounts/{username}/settings
API endpoint.
Now when you pull images, you’ll be using the cache.
To validate that the cache is working as expected:
To validate that the cache is actually serving your request, and to troubleshoot misconfigurations, check the logs for the cache service by running:
docker service logs --follow dtr-cache_cache
The most common configuration problems are related to TLS authentication:
When this happens, check the cache logs to troubleshoot the misconfiguration.
The certificates and private keys are now managed by Docker in a secure way. Don’t forget to delete sensitive files you’ve created on disk, like the private keys for the cache:
rm -rf certs
This example guides you through deploying a DTR cache, assuming that you’ve got a DTR deployment up and running. The below guide has been tested on Universal Control Plane 3.1, however it should work on any Kubernetes Cluster 1.8 or higher.
The DTR cache is going to be deployed as a Kubernetes Deployment, so that Kubernetes automatically takes care of scheduling and restarting the service if something goes wrong.
We’ll manage the cache configuration using a Kubernetes Config Map, and the TLS certificates using Kubernetes secrets. This allows you to manage the configurations securely and independently of the node where the cache is actually running.
At the end of this exercise you should have the following file structure on your workstation:
├── dtrcache.yaml # Yaml file to deploy cache with a single command
├── config.yaml # The cache configuration file
└── certs
├── cache.cert.pem # The cache public key certificate, including any intermediaries
├── cache.key.pem # The cache private key
└── dtr.cert.pem # DTR CA certificate
The DTR cache will be deployed with a TLS endpoint. For this you will need to generate a TLS certificate and key from a certificate authority. The way you expose the DTR cache will change the SANs required for this certificate.
For example:
On your workstation, create a directory called certs
. Within it
place the newly created certificate cache.cert.pem
and key
cache.key.pem
for your DTR cache. Also place the certificate
authority (including any intermediate certificate authorities) of the
certificate from your DTR deployment. This could be sourced from the
main DTR deployment using curl.
$ curl -s https://<dtr-fqdn>/ca -o certs/dtr.cert.pem
The DTR cache will take its configuration from a file mounted into the container. Below is an example configuration file for the DTR cache. This YAML should be customised for your environment with the relevant external DTR cache, worker node, or external load balancer FQDN.
With this configuration, the cache fetches image layers from DTR and keeps a local copy for 24 hours. After that, if a user requests that image layer, the cache will fetch it again from DTR.
The cache, by default, is configured to store image data inside its container. Therefore if something goes wrong with the cache service, and Kubernetes deploys a new pod, cached data is not persisted. Data will not be lost as it is still stored in the primary DTR. You can customize the storage parameters, if you want the cached images to be backended by persistent storage.
Note
Kubernetes Persistent Volumes or Persistent Volume Claims would have to be used to provide persistent backend storage capabilities for the cache.
cat > config.yaml <<EOF
version: 0.1
log:
level: info
storage:
delete:
enabled: true
filesystem:
rootdirectory: /var/lib/registry
http:
addr: 0.0.0.0:443
secret: generate-random-secret
host: https://<external-fqdn-dtrcache> # Could be DTR Cache / Loadbalancer / Worker Node external FQDN
tls:
certificate: /certs/cache.cert.pem
key: /certs/cache.key.pem
middleware:
registry:
- name: downstream
options:
blobttl: 24h
upstreams:
- https://<dtr-url> # URL of the Main DTR Deployment
cas:
- /certs/dtr.cert.pem
EOF
The Kubernetes Manifest file to deploy the DTR Cache is independent of how you choose to expose the DTR cache within your environment. The below example has been tested to work on Universal Control Plane 3.1, however it should work on any Kubernetes Cluster 1.8 or higher.
cat > dtrcache.yaml <<EOF
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: dtr-cache
namespace: dtr
spec:
replicas: 1
selector:
matchLabels:
app: dtr-cache
template:
metadata:
labels:
app: dtr-cache
annotations:
seccomp.security.alpha.kubernetes.io/pod: docker/default
spec:
containers:
- name: dtr-cache
image: docker/dtr-content-cache:2.7.5
command: ["bin/sh"]
args:
- start.sh
- /config/config.yaml
ports:
- name: https
containerPort: 443
volumeMounts:
- name: dtr-certs
readOnly: true
mountPath: /certs/
- name: dtr-cache-config
readOnly: true
mountPath: /config
volumes:
- name: dtr-certs
secret:
secretName: dtr-certs
- name: dtr-cache-config
configMap:
defaultMode: 0666
name: dtr-cache-config
EOF
At this point you should have a file structure on your workstation which looks like this:
├── dtrcache.yaml # Yaml file to deploy cache with a single command
├── config.yaml # The cache configuration file
└── certs
├── cache.cert.pem # The cache public key certificate
├── cache.key.pem # The cache private key
└── dtr.cert.pem # DTR CA certificate
You will also need the kubectl
command line tool configured to talk
to your Kubernetes cluster, either through a Kubernetes Config file or a
Universal Control Plane client bundle.
First we will create a Kubernetes namespace to logically separate all of our DTR cache components.
$ kubectl create namespace dtr
Create the Kubernetes Secrets, containing the DTR cache TLS certificates, and a Kubernetes ConfigMap containing the DTR cache configuration file.
$ kubectl -n dtr create secret generic dtr-certs \
--from-file=certs/dtr.cert.pem \
--from-file=certs/cache.cert.pem \
--from-file=certs/cache.key.pem
$ kubectl -n dtr create configmap dtr-cache-config \
--from-file=config.yaml
Finally create the Kubernetes Deployment.
$ kubectl create -f dtrcache.yaml
You can check if the deployment has been successful by checking the
running pods in your cluster: kubectl -n dtr get pods
If you need to troubleshoot your deployment, you can use
kubectl -n dtr describe pods <pods>
and / or
kubectl -n dtr logs <pods>
.
For external access to the DTR cache we need to expose the cache Pods to the outside world. In Kubernetes there are multiple ways to expose a service, depending on your infrastructure and your environment. For more information, see Publishing services - service types in the Kubernetes docs. It is important, though, that you are consistent and expose the cache through the same interface you created a certificate for previously. Otherwise the TLS certificate may not be valid through this alternative interface.
DTR Cache Exposure
You only need to expose your DTR cache through one external interface.
The first example exposes the DTR cache through NodePort. In this example you would have added a worker node’s FQDN to the TLS Certificate in step 1. Here you will be accessing the DTR cache through an exposed port on a worker node’s FQDN.
cat > dtrcacheservice.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
name: dtr-cache
namespace: dtr
spec:
type: NodePort
ports:
- name: https
port: 443
targetPort: 443
protocol: TCP
selector:
app: dtr-cache
EOF
kubectl create -f dtrcacheservice.yaml
To find out which port the DTR cache has been exposed on, you will need to run:
$ kubectl -n dtr get services
You can test that your DTR cache is externally reachable by using
curl
to hit the API endpoint, using both a worker node’s external
address, and the NodePort.
curl -X GET https://<workernodefqdn>:<nodeport>/v2/_catalog
{"repositories":[]}
This second example will expose the DTR cache through an ingress object. In this example you will need to create a DNS rule in your environment that will resolve a DTR cache external FQDN address to the address of your ingress controller. You should have also specified the same DTR cache external FQDN address within the DTR cache certificate in step 1.
Note
An ingress controller is a prerequisite for this example. If you have not deployed an ingress controller on your cluster, see Layer 7 Routing for UCP. This ingress controller will also need to support SSL passthrough.
cat > dtrcacheservice.yaml <<EOF
kind: Service
apiVersion: v1
metadata:
name: dtr-cache
namespace: dtr
spec:
selector:
app: dtr-cache
ports:
- protocol: TCP
port: 443
targetPort: 443
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: dtr-cache
namespace: dtr
annotations:
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
nginx.ingress.kubernetes.io/secure-backends: "true"
spec:
tls:
- hosts:
- <external-dtr-cache-fqdn> # Replace this value with your external DTR Cache address
rules:
- host: <external-dtr-cache-fqdn> # Replace this value with your external DTR Cache address
http:
paths:
- backend:
serviceName: dtr-cache
servicePort: 443
EOF
kubectl create -f dtrcacheservice.yaml
You can test that your DTR cache is externally reachable by using curl to hit the API endpoint. The address should be the one you defined above in the service definition file.
curl -X GET https://external-dtr-cache-fqdn/v2/_catalog
{"repositories":[]}
If you’re deploying a DTR cache in a zone with few users and with no uptime SLAs, a single cache service is enough for you.
But if you want to make sure your DTR cache is always available to users and is highly performant, you should configure your cache deployment for high availability.
The way you deploy a DTR cache is the same, whether you’re deploying a single replica or multiple ones. The difference is that you should configure the replicas to store data using a shared storage system.
When using a shared storage system, once an image layer is cached, any replica is able to serve it to users without having to fetch a new copy from DTR.
DTR caches support the following storage systems:
Alibaba Cloud Object Storage Service
Amazon S3
Azure Blob Storage
Google Cloud Storage
NFS
OpenStack Swift
If you’re using NFS as a shared storage system, make sure the shared directory is configured with:
/dtr-cache *(rw,root_squash,no_wdelay)
This ensures read-after-write consistency for NFS.
You should also mount the NFS directory on each node where you’ll deploy a DTR cache replica.
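For example, on each of those nodes you might mount the export like this (the NFS server address and mount point are placeholders):
# Mount the shared NFS export on every node that will run a cache replica.
sudo mkdir -p /mnt/dtr-cache
sudo mount -t nfs -o rw nfs-server.example.org:/dtr-cache /mnt/dtr-cache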
Use SSH to log in to a manager node of the swarm where you want to deploy the DTR cache.
If you’re using UCP to manage that swarm you can also use a client bundle to configure your Docker CLI client to connect to that swarm.
Label each node that is going to run the cache replica, by running:
docker node update --label-add dtr.cache=true <node-hostname>
Create the cache configuration files by following the instructions for deploying a single cache replica.
Make sure you adapt the storage
object, using the configuration
options for the shared storage of your choice.
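As an illustration, if you chose Amazon S3 the storage section of the cache configuration might look something like this (the bucket, region, and credentials are placeholders; the option names are those of the standard Docker Registry S3 driver):
storage:
  delete:
    enabled: true
  s3:
    accesskey: <aws-access-key>
    secretkey: <aws-secret-key>
    region: us-east-1
    bucket: <bucket-name>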
The last step is to deploy a load balancer of your choice to load-balance requests across the multiple replicas you deployed.
DTR caches are based on Docker Registry, and use the same configuration file format.
The DTR cache extends the Docker Registry configuration file format by
introducing a new middleware called downstream
that has three
configuration options: blobttl
, upstreams
, and cas
:
# Settings that you would include in a
# Docker Registry configuration file followed by
middleware:
registry:
- name: downstream
options:
blobttl: 24h
upstreams:
- <Externally-reachable address for upstream registry or content cache in format scheme://host:port>
cas:
- <Absolute path to next-hop upstream registry or content cache CA certificate in the container's filesystem>
Below you can find the description for each parameter, specific to DTR caches.
Parameter | Required | Description |
---|---|---|
blobttl |
No | A positive integer and an optional unit of time suffix to determine the
TTL (Time to Live) value for blobs in the cache. If blobttl is
configured, storage.delete.enabled must be set to true .
Acceptable units of time are:
- ns (nanoseconds)
- us (microseconds)
- ms (milliseconds)
- s (seconds)
- m (minutes)
- h (hours)
If you omit the suffix, the system interprets the value as nanoseconds. |
cas |
No | An optional list of absolute paths to PEM-encoded CA certificates of upstream registries or content caches. |
upstreams |
Yes | A list of externally-reachable addresses for upstream registries or content caches. If more than one host is specified, it will pull from registries in round-robin order. |
You can configure the Docker Trusted Registry (DTR) to automatically delete unused image layers, thus saving you disk space. This process is also known as garbage collection.
First you configure DTR to run a garbage collection job on a fixed schedule. At the scheduled time, DTR:
Starting in DTR 2.5, we introduced an experimental feature which lets you run garbage collection jobs without putting DTR in read-only mode. As of v2.6, online garbage collection is no longer in experimental mode. This means that the registry no longer has to be in read-only mode (or offline) during garbage collection.
In your browser, navigate to https://<dtr-url>
and log in with your
credentials. Select System on the left navigation pane, and then
click the Garbage collection tab to schedule garbage collection.
Select for how long the garbage collection job should run:
If you select Until done or For x minutes, you can specify a recurring schedule in UTC (Coordinated Universal Time) with the following options:
Once everything is configured you can choose to Save & Start to run the garbage collection job immediately, or just Save to run the job on the next scheduled interval.
In v2.5, you were notified with a banner under main navigation that no one can push images while a garbage collection job is running. With v2.6, this is no longer the case since garbage collection now happens while DTR is online and writable.
If you clicked Save & Start previously, verify that the garbage collection routine started by navigating to Job Logs.
Each image stored in DTR is made up of multiple files:
All these files are tracked in DTR’s metadata store in RethinkDB. These files are tracked in a content-addressable way such that a file corresponds to a cryptographic hash of the file’s content. This means that if two image tags hold exactly the same content, DTR only stores the image content once while making hash collisions nearly impossible, even if the tag name is different.
As an example, if wordpress:4.8
and wordpress:latest
have the
same content, the content will only be stored once. If you delete one of
these tags, the other won’t be deleted.
This means that when you delete an image tag, DTR cannot delete the underlying files of that image tag since other tags may also use the same files.
To facilitate online garbage collection, DTR makes a couple of changes to how it uses the storage backend:
To delete unused files, DTR does the following:
By default DTR only allows pushing images if the repository exists, and you have write access to the repository.
As an example, if you try to push to dtr.example.org/library/java:9
,
and the library/java
repository doesn’t exist yet, your push fails.
You can configure DTR to allow pushing to repositories that don’t exist yet. As an administrator, log into the DTR web UI, navigate to the Settings page, and enable Create repository on push.
From now on, when a user pushes to their personal sandbox (<user-name>/<repository>), or if the user is an administrator for the organization (<org>/<repository>), DTR will create a repository if it doesn't exist yet. In that case, the repository is created as private. You can also change this setting through the API:
curl --user <admin-user>:<password> \
--request POST "<dtr-url>/api/v0/meta/settings" \
--header "accept: application/json" \
--header "content-type: application/json" \
--data "{ \"createRepositoryOnPush\": true}"
Docker Trusted Registry makes outgoing connections to check for new versions, automatically renew its license, and update its vulnerability database. If DTR can’t access the internet, then you’ll have to manually apply updates.
One option to keep your environment secure while still allowing DTR access to the internet is to use a web proxy. If you have an HTTP or HTTPS proxy, you can configure DTR to use it. To avoid downtime you should do this configuration outside business peak hours.
As an administrator, log into a node where DTR is deployed, and run:
docker run -it --rm \
docker/dtr:2.7.5 reconfigure \
--http-proxy http://<domain>:<port> \
--https-proxy https://<domain>:<port> \
--ucp-insecure-tls
To confirm how DTR is configured, check the Settings page on the web UI.
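If you prefer the command line, the same settings endpoint used elsewhere in this guide should also return the proxy values; a sketch of the request:
curl --user <admin-user>:<password> \
  --request GET "https://<dtr-url>/api/v0/meta/settings" \
  --header "accept: application/json"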
If the web proxy requires authentication, you can include the username and password in the command, as shown below:
docker run -it --rm \
docker/dtr:2.7.5 reconfigure \
--http-proxy username:password@<domain>:<port> \
--https-proxy username:password@<domain>:<port> \
--ucp-insecure-tls
Note
DTR will hide the password portion of the URL, when it is displayed in the DTR UI.
With DTR you get to control which users have access to your image repositories.
By default, anonymous users can only pull images from public repositories. They can’t create new repositories or push to existing ones. You can then grant permissions to enforce fine-grained access control to image repositories. For that:
Start by creating a user.
Users are shared across UCP and DTR. When you create a new user in Docker Universal Control Plane, that user becomes available in DTR and vice versa. Registered users can create and manage their own repositories.
You can also integrate with an LDAP service to manage users from a single place.
Extend the permissions by adding the user to a team.
To extend a user’s permission and manage their permissions over repositories, you add the user to a team. A team defines the permissions users have for a set of repositories.
When a user creates a repository, only that user can make changes to the repository settings, and push new images to it.
Organizations take permission management one step further, since they allow multiple users to own and manage a common set of repositories. This is useful when implementing team workflows. With organizations you can delegate the management of a set of repositories and user permissions to the organization administrators.
An organization owns a set of repositories, and defines a set of teams. With teams you can define fine-grained permissions that a team of users has for a set of repositories.
In this example, the ‘Whale’ organization has three repositories and two teams:
When using the built-in authentication, you can create users to grant them fine-grained permissions.
Users are shared across UCP and DTR. When you create a new user in Docker Universal Control Plane, that user becomes available in DTR and vice versa.
To create a new user, go to the DTR web UI, and navigate to the Users page.
Click the New user button, and fill in the user information.
Check the Trusted Registry admin option, if you want to grant permissions for the user to be a UCP and DTR administrator.
You can extend a user’s default permissions by granting them individual permissions in other image repositories, by adding the user to a team. A team defines the permissions a set of users have for a set of repositories.
To create a new team, go to the DTR web UI, and navigate to the Organizations page. Then click the organization where you want to create the team. In this example, we’ll create the ‘billing’ team under the ‘whale’ organization.
Click ‘+’ to create a new team, and give it a name.
Once you have created a team, click the team name, to manage its settings. The first thing we need to do is add users to the team. Click the Add user button and add users to the team.
The next step is to define the permissions this team has for a set of repositories. Navigate to the Repositories tab, and click the Add repository button.
Choose the repositories this team has access to, and what permission levels the team members have.
There are three permission levels available:
Permission level | Description |
---|---|
Read only | View repository and pull images. |
Read & Write | View repository, pull and push images. |
Admin | Manage repository and change its settings, pull and push images. |
If you’re an organization owner, you can delete a team in that organization. Navigate to the Team, choose the Settings tab, and click Delete.
When a user creates a repository, only that user has permissions to make changes to the repository.
For team workflows, where multiple users have permissions to manage a set of common repositories, create an organization. By default, DTR has one organization called ‘docker-datacenter’, that is shared between DTR and UCP.
To create a new organization, navigate to the DTR web UI, and go to the Organizations page.
Click the New organization button, and choose a meaningful name for the organization.
Repositories owned by this organization will contain the organization name, so to pull an image from that repository, you’ll use:
docker pull <dtr-domain-name>/<organization>/<repository>:<tag>
Click Save to create the organization, and then click the organization to define which users are allowed to manage this organization. These users will be able to edit the organization settings, edit all repositories owned by the organization, and define the user permissions for this organization.
For this, click the Add user button, select the users that you want to grant permissions to manage the organization, and click Save. Then change their permissions from ‘Member’ to Org Owner.
Docker Trusted Registry allows you to define fine-grained permissions over image repositories.
Users are shared across UCP and DTR. When you create a new user in Docker Universal Control Plane, that user becomes available in DTR and vice versa. When you create a trusted admin in DTR, the admin has permissions to manage:
Teams allow you to define the permissions a set of users has for a set of repositories. Three permission levels are available:
Repository operation | read | read-write | admin |
---|---|---|---|
View/ browse | x | x | x |
Pull | x | x | x |
Push | x | x | |
Start a scan | x | x | |
Delete tags | x | x | |
Edit description | x | ||
Set public or private | x | ||
Manage user access | x | ||
Delete repository | x |
Team permissions are additive. When a user is a member of multiple teams, they have the highest permission level defined by those teams.
Here’s an overview of the permission levels available in DTR:
You can configure DTR to automatically post event notifications to a webhook URL of your choosing. This lets you build complex CI and CD pipelines with your Docker images. The following is a complete list of event types you can trigger webhook notifications for via the web interface or the API.
Event Type | Scope | Access Level | Availability |
---|---|---|---|
Tag pushed to repository
(TAG_PUSH ) |
Individual repositories | Repository admin | Web UI & API |
Tag pulled from repository
(TAG_PULL ) |
Individual repositories | Repository admin | Web UI & API |
Tag deleted from repository
(TAG_DELETE ) |
Individual repositories | Repository admin | Web UI & API |
Manifest pushed to repository | Individual repositories | Repository admin | Web UI & API |
Manifest pulled from
repository
(MANIFEST_PULL ) |
Individual repositories | Repository admin | Web UI & API |
Manifest deleted from
repository
(MANIFEST_DELETE ) |
Individual repositories | Repository admin | Web UI & API |
Security scan completed
(SCAN_COMPLETED ) |
Individual repositories | Repository admin | Web UI & API |
Security scan failed
(SCAN_FAILED ) |
Individual repositories | Repository admin | Web UI & API |
Image promoted from
repository (PROMOTION ) |
Individual repositories | Repository admin | Web UI & API |
Image mirrored from
repository
(PUSH_MIRRORING ) |
Individual repositories | Repository admin | Web UI & API |
Image mirrored from remote
repository
(POLL_MIRRORING ) |
Individual repositories | Repository admin | Web UI & API |
Repository created, updated,
or deleted
(REPO_CREATED ,
REPO_UPDATED , and
REPO_DELETED ) |
Namespaces / Organizations | Namespace / Org owners | API Only |
Security scanner update completed (SCANNER_UPDATE_COMPLETED ) | Global | DTR admin | API Only |
You must have admin privileges to a repository or namespace in order to subscribe to its webhook events. For example, a user must be an admin of repository “foo/bar” to subscribe to its tag push events. A DTR admin can subscribe to any event.
In your browser, navigate to https://<dtr-url>
and log in with
your credentials.
Select Repositories from the left navigation pane, and then click the name of the repository that you want to view. Note that you will have to click the repository name that follows the / after your namespace.
Select the Webhooks tab, and click New Webhook.
From the drop-down list, select the event that will trigger the webhook.
Set the URL which will receive the JSON payload. Click Test next to the Webhook URL field, so that you can validate that the integration is working. At your specified URL, you should receive a JSON payload for your chosen event type notification.
{
"type": "TAG_PUSH",
"createdAt": "2019-05-15T19:39:40.607337713Z",
"contents": {
"namespace": "foo",
"repository": "bar",
"tag": "latest",
"digest": "sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c",
"imageName": "foo/bar:latest",
"os": "linux",
"architecture": "amd64",
"author": "",
"pushedAt": "2015-01-02T15:04:05Z"
},
"location": "/repositories/foo/bar/tags/latest"
}
Expand “Show advanced settings” to paste the TLS certificate associated with your webhook URL. For testing purposes, you can test over HTTP instead of HTTPS.
Click Create to save. Once saved, your webhook is active and starts sending POST notifications whenever your chosen event type is triggered.
As a repository admin, you can add or delete a webhook at any point. Additionally, you can create, view, and delete webhooks for your organization or trusted registry using the API.
See Webhook types for a list of events you can trigger notifications for via the API.
Your DTR hostname serves as the base URL for your API requests.
From the DTR web interface, click API on the bottom left navigation pane to explore the API resources and endpoints. Click Execute to send your API request.
You can use curl to send
HTTP or HTTPS API requests. Note that you will have to specify
skipTLSVerification: true
on your request in order to test the
webhook endpoint over HTTP.
curl -u test-user:$TOKEN -X POST "https://dtr-example.com/api/v0/webhooks" -H "accept: application/json" -H "content-type: application/json" -d "{ \"endpoint\": \"https://webhook.site/441b1584-949d-4608-a7f3-f240bdd31019\", \"key\": \"maria-testorg/lab-words\", \"skipTLSVerification\": true, \"type\": \"TAG_PULL\"}"
{
"id": "b7bf702c31601efb4796da59900ddc1b7c72eb8ca80fdfb1b9fecdbad5418155",
"type": "TAG_PULL",
"key": "maria-testorg/lab-words",
"endpoint": "https://webhook.site/441b1584-949d-4608-a7f3-f240bdd31019",
"authorID": "194efd8e-9ee6-4d43-a34b-eefd9ce39087",
"createdAt": "2019-05-22T01:55:20.471286995Z",
"lastSuccessfulAt": "0001-01-01T00:00:00Z",
"inactive": false,
"tlsCert": "",
"skipTLSVerification": true
}
To subscribe to events, send a POST
request to /api/v0/webhooks
with the following JSON payload:
{
"type": "TAG_PUSH",
"key": "foo/bar",
"endpoint": "https://example.com"
}
The keys in the payload are:
type: The event type to subscribe to.
key: The namespace/organization or repo to subscribe to. For example, "foo/bar" to subscribe to pushes to the "bar" repository within the namespace/organization "foo".
endpoint: The URL to send the JSON payload to.
Normal users must supply a "key" to scope a particular webhook event to a repository or a namespace/organization. DTR admins can choose to omit this, meaning a POST event notification of your specified type will be sent for all DTR repositories and namespaces.
Whenever your specified event type occurs, DTR will send a POST request to the given endpoint with a JSON-encoded payload. The payload will always have the following wrapper:
{
"type": "...",
"createdAt": "2012-04-23T18:25:43.511Z",
"contents": {...}
}
type refers to the event type received at the specified subscription endpoint.
contents refers to the payload of the event itself. Each event is different, therefore the structure of the JSON object in contents will change depending on the event type. See Content structure for more details.
Before subscribing to an event, you can view and test your endpoints using fake data. To send a test payload, send a POST request to /api/v0/webhooks/test with the following payload:
{
"type": "...",
"endpoint": "https://www.example.com/"
}
Change type
to the event type that you want to receive. DTR will
then send an example payload to your specified endpoint. The example
payload sent is always the same.
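For instance, to exercise an endpoint with a fake TAG_PUSH payload, a request along these lines (following the same curl conventions used above) should work:
curl -u test-user:$TOKEN -X POST "https://<dtr-url>/api/v0/webhooks/test" \
  -H "accept: application/json" \
  -H "content-type: application/json" \
  -d '{ "type": "TAG_PUSH", "endpoint": "https://www.example.com/" }'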
Comments after (//
) are for informational purposes only, and the
example payloads have been clipped for brevity.
Tag push
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"tag": "", // (string) the name of the tag just pushed
"digest": "", // (string) sha256 digest of the manifest the tag points to (eg. "sha256:0afb...")
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar:tag)
"os": "", // (string) the OS for the tag's manifest
"architecture": "", // (string) the architecture for the tag's manifest
"author": "", // (string) the username of the person who pushed the tag
"pushedAt": "", // (string) JSON-encoded timestamp of when the push occurred
...
}
Tag delete
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"tag": "", // (string) the name of the tag just deleted
"digest": "", // (string) sha256 digest of the manifest the tag points to (eg. "sha256:0afb...")
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar:tag)
"os": "", // (string) the OS for the tag's manifest
"architecture": "", // (string) the architecture for the tag's manifest
"author": "", // (string) the username of the person who deleted the tag
"deletedAt": "", // (string) JSON-encoded timestamp of when the delete occurred
...
}
Manifest push
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"digest": "", // (string) sha256 digest of the manifest (eg. "sha256:0afb...")
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar@sha256:0afb...)
"os": "", // (string) the OS for the manifest
"architecture": "", // (string) the architecture for the manifest
"author": "", // (string) the username of the person who pushed the manifest
...
}
Manifest delete
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"digest": "", // (string) sha256 digest of the manifest (eg. "sha256:0afb...")
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar@sha256:0afb...)
"os": "", // (string) the OS for the manifest
"architecture": "", // (string) the architecture for the manifest
"author": "", // (string) the username of the person who deleted the manifest
"deletedAt": "", // (string) JSON-encoded timestamp of when the delete occurred
...
}
Security scan completed
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"tag": "", // (string) the name of the tag scanned
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar:tag)
"scanSummary": {
"namespace": "", // (string) repository's namespace/organization name
"repository": "", // (string) repository name
"tag": "", // (string) the name of the tag just pushed
"critical": 0, // (int) number of critical issues, where CVSS >= 7.0
"major": 0, // (int) number of major issues, where CVSS >= 4.0 && CVSS < 7
"minor": 0, // (int) number of minor issues, where CVSS > 0 && CVSS < 4.0
"last_scan_status": 0, // (int) enum; see scan status section
"check_completed_at": "", // (string) JSON-encoded timestamp of when the scan completed
...
}
}
Security scan failed
{
"namespace": "", // (string) namespace/organization for the repository
"repository": "", // (string) repository name
"tag": "", // (string) the name of the tag scanned
"imageName": "", // (string) the fully-qualified image name including DTR host used to pull the image (eg. 10.10.10.1/foo/bar@sha256:0afb...)
"error": "", // (string) the error that occurred while scanning
...
}
Repository event (created/updated/deleted)
{
"namespace": "", // (string) repository's namespace/organization name
"repository": "", // (string) repository name
"event": "", // (string) enum: "REPO_CREATED", "REPO_DELETED" or "REPO_UPDATED"
"author": "", // (string) the name of the user responsible for the event
"data": {} // (object) when updating or creating a repo this follows the same format as an API response from /api/v0/repositories/{namespace}/{repository}
}
Security scanner update complete
{
"scanner_version": "",
"scanner_updated_at": "", // (string) JSON-encoded timestamp of when the scanner updated
"db_version": 0, // (int) newly updated database version
"db_updated_at": "", // (string) JSON-encoded timestamp of when the database updated
"success": <true|false> // (bool) whether the update was successful
"replicas": { // (object) a map keyed by replica ID containing update information for each replica
"replica_id": {
"db_updated_at": "", // (string) JSON-encoded time of when the replica updated
"version": "", // (string) version updated to
"replica_id": "" // (string) replica ID
},
...
}
}
To view existing subscriptions, send a GET
request to
/api/v0/webhooks
. As a normal user (i.e., not a DTR admin), this will
show all of your current subscriptions across every
namespace/organization and repository. As a DTR admin, this will show
every webhook configured for your DTR.
The API response will be in the following format:
[
{
"id": "", // (string): UUID of the webhook subscription
"type": "", // (string): webhook event type
"key": "", // (string): the individual resource this subscription is scoped to
"endpoint": "", // (string): the endpoint to send POST event notifications to
"authorID": "", // (string): the user ID resposible for creating the subscription
"createdAt": "", // (string): JSON-encoded datetime when the subscription was created
},
...
]
For more information, view the API documentation.
You can also view subscriptions for a given resource that you are an admin of. For example, if you have admin rights to the repository “foo/bar”, you can view all subscriptions (even other people’s) from a particular API endpoint. These endpoints are:
GET /api/v0/repositories/{namespace}/{repository}/webhooks: View all webhook subscriptions for a repository
GET /api/v0/repositories/{namespace}/webhooks: View all webhook subscriptions for a namespace/organization
To delete a webhook subscription, send a DELETE request to /api/v0/webhooks/{id}, replacing {id} with the webhook subscription ID which you would like to delete.
Only a DTR admin or an admin for the resource with the event subscription can delete a subscription. As a normal user, you can only delete subscriptions for repositories which you manage.
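For example, following the same curl conventions used above (the subscription ID is a placeholder):
curl -u test-user:$TOKEN -X DELETE "https://<dtr-url>/api/v0/webhooks/<subscription-id>"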
Docker Trusted Registry (DTR) uses a job queue to schedule batch jobs. Jobs are added to a cluster-wide job queue, and then consumed and executed by a job runner within DTR.
All DTR replicas have access to the job queue, and have a job runner component that can get and execute work.
When a job is created, it is added to a cluster-wide job queue and
enters the waiting
state. When one of the DTR replicas is ready to
claim the job, it waits a random time of up to 3
seconds to give
every replica the opportunity to claim the task.
A replica claims a job by adding its replica ID to the job. That way,
other replicas will know the job has been claimed. Once a replica claims
a job, it adds that job to an internal queue, which in turn sorts the
jobs by their scheduledAt
time. Once that happens, the replica
updates the job status to running
, and starts executing it.
The job runner component of each DTR replica keeps a
heartbeatExpiration
entry on the database that is shared by all
replicas. If a replica becomes unhealthy, other replicas notice the
change and update the status of the failing worker to dead
. Also,
all the jobs that were claimed by the unhealthy replica enter the
worker_dead
state, so that other replicas can claim the job.
DTR runs periodic and long-running jobs. The following is a complete list of jobs you can filter for via the user interface or the API.
Job | Description |
---|---|
gc | A garbage collection job that deletes layers associated with deleted images. |
onlinegc | A garbage collection job that deletes layers associated with deleted images without putting the registry in read-only mode. |
onlinegc_metadata | A garbage collection job that deletes metadata associated with deleted images. |
onlinegc_joblogs | A garbage collection job that deletes job logs based on a configured job history setting. |
metadatastoremigration | A necessary migration that enables the onlinegc feature. |
sleep | Used for testing the correctness of the jobrunner. It sleeps for 60 seconds. |
false | Used for testing the correctness of the jobrunner. It runs the false
command and immediately fails. |
tagmigration | Used for synchronizing tag and manifest information between the DTR database and the storage backend. |
bloblinkmigration | A DTR 2.1 to 2.2 upgrade process that adds references for blobs to repositories in the database. |
license_update | Checks for license expiration extensions if online license updates are enabled. |
scan_check | An image security scanning job. This job does not perform the actual
scanning, rather it spawns scan_check_single jobs (one for each
layer in the image). Once all of the scan_check_single jobs are
complete, this job will terminate. |
scan_check_single | A security scanning job for a particular layer given by the
parameter: SHA256SUM . This job breaks up the layer into components
and checks each component for vulnerabilities. |
scan_check_all | A security scanning job that updates all of the currently scanned images to display the latest vulnerabilities. |
update_vuln_db | A job that is created to update DTR’s vulnerability database. It uses an
Internet connection to check for database updates through
https://dss-cve-updates.docker.com/ and updates the
dtr-scanningstore container if there is a new update available. |
scannedlayermigration | A DTR 2.4 to 2.5 upgrade process that restructures scanned image data. |
push_mirror_tag | A job that pushes a tag to another registry after a push mirror policy has been evaluated. |
poll_mirror | A global cron that evaluates poll mirroring policies. |
webhook | A job that is used to dispatch a webhook payload to a single endpoint. |
nautilus_update_db | The old name for the update_vuln_db job. This may be visible on old log files. |
ro_registry | A user-initiated job for manually switching DTR into read-only mode. |
tag_pruning | A job for cleaning up unnecessary or unwanted repository tags which can be configured by repository admins. For configuration options, see Tag Pruning. |
Jobs can have one of the following status values:
Status | Description |
---|---|
waiting | Unclaimed job waiting to be picked up by a worker. |
running | The job is currently being run by the specified workerID . |
done | The job has successfully completed. |
error | The job has completed with errors. |
cancel_request | The status of a job is monitored by the worker in the database. If the
job status changes to cancel_request , the job is canceled by the worker. |
cancel | The job has been canceled and was not fully executed. |
deleted | The job and its logs have been removed. |
worker_dead | The worker for this job has been declared dead and the job will not
continue. |
worker_shutdown | The worker that was running this job has been gracefully stopped. |
worker_resurrection | The worker for this job has reconnected to the database and will cancel this job. |
As of DTR 2.2, admins were able to view and audit jobs within DTR using the API. DTR 2.6 enhances those capabilities by adding a Job Logs tab under System settings on the user interface. The tab displays a sortable and paginated list of jobs along with links to associated job logs.
To view the list of jobs within DTR, do the following:
Navigate to https://<dtr-url>
and log in with your UCP
credentials.
Select System from the left navigation pane, and then click Job
Logs. You should see a paginated list of past, running, and queued
jobs. By default, Job Logs shows the latest 10
jobs on the
first page.
Specify a filtering option. Job Logs lets you filter by:
Action: See Audit Jobs via the API: Job Types for an explanation on the different actions or job types.
Worker ID: The ID of the worker in a DTR replica that is responsible for running the job.
Optional: Click Edit Settings on the right of the filtering options to update your Job Logs settings. See Enable auto-deletion of job logs for more details.
The following is an explanation of the job-related fields displayed in
Job Logs and uses the filtered online_gc
action from above.
Job | Description | Example |
---|---|---|
Action | The type of action or job being performed. See Job Types for a full list of job types. | onlinegc |
ID | The ID of the job. | ccc05646-569a-4ac4-b8e1-113111f63fb9 |
Worker | The ID of the worker node responsible for running the job. | 8f553c8b697c |
Status | Current status of the action or job. See Job Status for more details. | done |
Start Time | Time when the job started. | 9/23/2018 7:04 PM |
Last Updated | Time when the job was last updated. | 9/23/2018 7:04 PM |
View Logs | Links to the full logs for the job. | [View Logs] |
To view the log details for a specific job, do the following:
Click View Logs next to the job’s Last Updated value. You will be redirected to the log detail page of your selected job.
Notice how the job ID
is reflected in the URL while the Action
and
the abbreviated form of the job ID
are reflected in the heading. Also,
the JSON lines displayed are job-specific DTR container logs.
See DTR Internal Components for more
details.
Enter or select a different line count to truncate the number of lines displayed. Lines are cut off from the end of the logs.
This covers troubleshooting batch jobs via the API and was introduced in DTR 2.2. Starting in DTR 2.6, admins have the ability to audit jobs using the web interface.
Each job runner has a limited capacity and will not claim jobs that
require a higher capacity. You can see the capacity of a job runner via
the GET /api/v0/workers
endpoint:
{
"workers": [
{
"id": "000000000000",
"status": "running",
"capacityMap": {
"scan": 1,
"scanCheck": 1
},
"heartbeatExpiration": "2017-02-18T00:51:02Z"
}
]
}
This means that the worker with replica ID 000000000000
has a
capacity of 1 scan
and 1 scanCheck
. Next, review the list of
available jobs:
{
"jobs": [
{
"id": "0",
"workerID": "",
"status": "waiting",
"capacityMap": {
"scan": 1
}
},
{
"id": "1",
"workerID": "",
"status": "waiting",
"capacityMap": {
"scan": 1
}
},
{
"id": "2",
"workerID": "",
"status": "waiting",
"capacityMap": {
"scanCheck": 1
}
}
]
}
If worker 000000000000
notices the jobs in waiting
state above,
then it will be able to pick up jobs 0
and 2
since it has the
capacity for both. Job 1
will have to wait until the previous scan
job, 0
, is completed. The job queue will then look like:
{
"jobs": [
{
"id": "0",
"workerID": "000000000000",
"status": "running",
"capacityMap": {
"scan": 1
}
},
{
"id": "1",
"workerID": "",
"status": "waiting",
"capacityMap": {
"scan": 1
}
},
{
"id": "2",
"workerID": "000000000000",
"status": "running",
"capacityMap": {
"scanCheck": 1
}
}
]
}
You can get a list of jobs via the GET /api/v0/jobs/
endpoint. Each
job looks like:
{
"id": "1fcf4c0f-ff3b-471a-8839-5dcb631b2f7b",
"retryFromID": "1fcf4c0f-ff3b-471a-8839-5dcb631b2f7b",
"workerID": "000000000000",
"status": "done",
"scheduledAt": "2017-02-17T01:09:47.771Z",
"lastUpdated": "2017-02-17T01:10:14.117Z",
"action": "scan_check_single",
"retriesLeft": 0,
"retriesTotal": 0,
"capacityMap": {
"scan": 1
},
"parameters": {
"SHA256SUM": "1bacd3c8ccb1f15609a10bd4a403831d0ec0b354438ddbf644c95c5d54f8eb13"
},
"deadline": "",
"stopTimeout": ""
}
The JSON fields of interest here are:
id : The ID of the job
workerID : The ID of the worker in a DTR replica that is running this job
status : The current state of the job
action : The type of job the worker will actually perform
capacityMap : The available capacity a worker needs for this job to run
Several of the jobs performed by DTR are run in a recurrent schedule.
You can see those jobs using the GET /api/v0/crons
endpoint:
{
"crons": [
{
"id": "48875b1b-5006-48f5-9f3c-af9fbdd82255",
"action": "license_update",
"schedule": "57 54 3 * * *",
"retries": 2,
"capacityMap": null,
"parameters": null,
"deadline": "",
"stopTimeout": "",
"nextRun": "2017-02-22T03:54:57Z"
},
{
"id": "b1c1e61e-1e74-4677-8e4a-2a7dacefffdc",
"action": "update_db",
"schedule": "0 0 3 * * *",
"retries": 0,
"capacityMap": null,
"parameters": null,
"deadline": "",
"stopTimeout": "",
"nextRun": "2017-02-22T03:00:00Z"
}
]
}
The schedule
field uses a cron expression following the
(seconds) (minutes) (hours) (day of month) (month) (day of week)
format. For example, 57 54 3 * * *
with cron ID
48875b1b-5006-48f5-9f3c-af9fbdd82255
will be run at 03:54:57
on
any day of the week or the month, which is 2017-02-22T03:54:57Z
in
the example JSON response above.
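As a hedged illustration, you can query this endpoint with curl in the same style as the other API examples in this guide (credentials and URL are placeholders):
# List the recurrent jobs and their cron schedules
curl -ksL -u <user>:<pass> "https://<dtr-url>/api/v0/crons"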
Docker Trusted Registry has a global setting for auto-deletion of job logs, which allows them to be removed as part of garbage collection. DTR admins can enable auto-deletion of job logs in DTR 2.6 based on specified conditions, which are covered below.
In your browser, navigate to https://<dtr-url>
and log in with
your UCP credentials.
Select System on the left navigation pane which will display the Settings page by default.
Scroll down to Job Logs and turn on Auto-Deletion.
Specify the conditions with which a job log auto-deletion will be triggered.
DTR allows you to set your auto-deletion conditions based on the following optional job log attributes:
Name | Description | Example |
---|---|---|
Age | Lets you remove job logs which are older than your specified number of hours, days, weeks or months | 2 months |
Max number of events | Lets you specify the maximum number of job logs allowed within DTR. | 100 |
If you check and specify both, job logs will be removed from DTR during garbage collection if either condition is met. You should see a confirmation message right away.
Click Start Deletion if you’re ready. Read more about garbage collection if you’re unsure about this operation.
Navigate to System > Job Logs to confirm that onlinegc_joblogs has started. For a detailed breakdown of individual job logs, see View Job-specific Logs in “Audit Jobs via the Web Interface.”
Docker Trusted Registry is a Dockerized application. To monitor it, you can use the same tools and techniques you’re already using to monitor other containerized applications running on your cluster. One way to monitor DTR is using the monitoring capabilities of Docker Universal Control Plane.
In your browser, log in to Docker Universal Control Plane (UCP), and navigate to the Stacks page. If you have DTR set up for high-availability, then all the DTR replicas are displayed.
To check the containers for the DTR replica, click the replica you want to inspect, click Inspect Resource, and choose Containers.
Now you can drill into each DTR container to see its logs and find the root cause of the problem.
DTR also exposes several endpoints you can use to assess if a DTR replica is healthy or not:
/_ping : Checks if the DTR replica is healthy, and returns a simple JSON response. This is useful for load balancing or other automated health check tasks.
/nginx_status : Returns the number of connections being handled by the NGINX front-end used by DTR.
/api/v0/meta/cluster_status : Returns extensive information about all DTR replicas.
The /api/v0/meta/cluster_status
endpoint requires administrator
credentials, and returns a JSON object for the entire cluster as observed by
the replica being queried. You can authenticate your requests using HTTP basic
auth.
curl -ksL -u <user>:<pass> https://<dtr-domain>/api/v0/meta/cluster_status
{
"current_issues": [
{
"critical": false,
"description": "... some replicas are not ready. The following servers are
not reachable: dtr_rethinkdb_f2277ad178f7",
}],
"replica_health": {
"f2277ad178f7": "OK",
"f3712d9c419a": "OK",
"f58cf364e3df": "OK"
},
}
You can find health status on the current_issues
and
replica_health
arrays. If this endpoint doesn’t provide meaningful
information when trying to troubleshoot, try troubleshooting using
logs.
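For example, here is a minimal sketch of pulling recent logs from a DTR container on a replica node; the name filter is an assumption based on the <component>-<replicaID> container naming shown later in this guide:
# Show the last 100 log lines from the DTR API container on this node
docker logs --tail 100 $(docker ps -q -f name=dtr-api | head -n 1)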
Docker Content Trust (DCT) keeps audit logs of changes made to trusted repositories. Every time you push a signed image to a repository, or delete trust data for a repository, DCT logs that information.
These logs are only available from the DTR API.
To access the audit logs you need to authenticate your requests using an authentication token. You can get an authentication token for all repositories, or one that is specific to a single repository.
DTR returns a JSON file with a token, even when the user doesn’t have access to the repository to which they requested the authentication token. This token doesn’t grant access to DTR repositories.
The JSON file returned has the following structure:
{
"token": "<token>",
"access_token": "<token>",
"expires_in": "<expiration in seconds>",
"issued_at": "<time>"
}
Once you have an authentication token you can use the following endpoints to get audit logs:
URL | Description | Authorization |
---|---|---|
GET /v2/_trust/changefeed |
Get audit logs for all repositories. | Global scope token |
GET /v2/<dtr-url>/<repository>/_trust/changefeed |
Get audit logs for a specific repository. | Repository-specific token |
Both endpoints have the following query string parameters:
Field name | Required | Type | Description |
---|---|---|---|
change_id |
Yes | String | A non-inclusive starting change ID from which to start returning results. This will typically be the first or last change ID from the previous page of records requested, depending on which direction you are paging in. |
records |
Yes | Signed integer | The number of records to return. A negative value indicates the number
of records preceding the change_id should be returned. Records are
always returned sorted from oldest to newest. |
The response is a JSON like:
{
"count": 1,
"records": [
{
"ID": "0a60ec31-d2aa-4565-9b74-4171a5083bef",
"CreatedAt": "2017-11-06T18:45:58.428Z",
"GUN": "dtr.example.org/library/wordpress",
"Version": 1,
"SHA256": "a4ffcae03710ae61f6d15d20ed5e3f3a6a91ebfd2a4ba7f31fc6308ec6cc3e3d",
"Category": "update"
}
]
}
Below is the description for each of the fields in the response:
count |
The number of records returned. |
---|---|
ID |
The ID of the change record. Should be used in the change_id field
of requests to provide a non-exclusive starting index. It should be
treated as an opaque value that is guaranteed to be unique within an
instance of notary. |
CreatedAt |
The time the change happened. |
GUN |
The DTR repository that was changed. |
Version |
The version that the repository was updated to. This increments every time there’s a change to the trust repository. This is always |
SHA256 |
The checksum of the timestamp being updated to. This can be used with the existing notary APIs to request said timestamp. This is always an empty string for events representing trusted data being removed from the repository |
Category |
The kind of change that was made to the trusted repository. Can be
update , or deletion . |
The results only include audit logs for events that happened more than 60 seconds ago, and are sorted from oldest to newest.
Even though the authentication API always returns a token, the changefeed API validates if the user has access to see the audit logs or not:
Before going through this example, make sure that you:
Have created the library/wordpress repository.
Have jq installed, to make it easier to parse the JSON responses.
# Pull an image from Docker Hub
docker pull wordpress:latest
# Tag that image
docker tag wordpress:latest <dtr-url>/library/wordpress:1
# Log into DTR
docker login <dtr-url>
# Push the image to DTR and sign it
DOCKER_CONTENT_TRUST=1 docker push <dtr-url>/library/wordpress:1
# Get global-scope authorization token, and store it in TOKEN
export TOKEN=$(curl --insecure --silent \
--user '<user>:<password>' \
'https://<dtr-url>/auth/token?realm=dtr&service=dtr&scope=registry:catalog:*' | jq --raw-output .token)
# Get audit logs for all repositories and pretty-print it
# If you pushed the image less than 60 seconds ago, it's possible
# that DTR doesn't show any events. Retry the command after a while.
curl --insecure --silent \
--header "Authorization: Bearer $TOKEN" \
"https://<dtr-url>/v2/_trust/changefeed?records=10&change_id=0" | jq .
Before going through this example, make sure that you:
Have created the library/nginx repository.
Have jq installed, to make it easier to parse the JSON responses.
# Pull an image from Docker Hub
docker pull nginx:latest
# Tag that image
docker tag nginx:latest <dtr-url>/library/nginx:1
# Log into DTR
docker login <dtr-url>
# Push the image to DTR and sign it
DOCKER_CONTENT_TRUST=1 docker push <dtr-url>/library/nginx:1
# Get global-scope authorization token, and store it in TOKEN
export TOKEN=$(curl --insecure --silent \
--user '<user>:<password>' \
'https://<dtr-url>/auth/token?realm=dtr&service=dtr&scope=repository:<dtr-url>/<repository>:pull' | jq --raw-output .token)
# Get audit logs for all repositories and pretty-print it
# If you pushed the image less than 60 seconds ago, it's possible that
# Docker Content Trust won't show any events. Retry the command after a while.
curl --insecure --silent \
--header "Authorization: Bearer $TOKEN" \
"https://<dtr-url>/v2/<dtr-url>/<dtr-repo>/_trust/changefeed?records=10&change_id=0" | jq .
This guide contains tips and tricks for troubleshooting DTR problems.
High availability in DTR depends on swarm overlay networking. One way to test if overlay networks are working correctly is to deploy containers to the same overlay network on different nodes and see if they can ping one another.
Use SSH to log into a node and run:
docker run -it --rm \
--net dtr-ol --name overlay-test1 \
--entrypoint sh docker/dtr
Then use SSH to log into another node and run:
docker run -it --rm \
--net dtr-ol --name overlay-test2 \
--entrypoint ping docker/dtr -c 3 overlay-test1
If the second command succeeds, it indicates overlay networking is working correctly between those nodes.
You can run this test with any attachable overlay network and any Docker
image that has sh
and ping
.
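For instance, here is a hedged sketch of the same test using the alpine image instead of docker/dtr (an assumption; any image that includes sh and ping works):
# On the first node, start a container attached to the dtr-ol overlay network
docker run -it --rm --net dtr-ol --name overlay-test1 --entrypoint sh alpine
# On the second node, ping the first container across the overlay network
docker run -it --rm --net dtr-ol --name overlay-test2 --entrypoint ping alpine -c 3 overlay-test1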
DTR uses RethinkDB for persisting data and replicating it across replicas. It might be helpful to connect directly to the RethinkDB instance running on a DTR replica to check the DTR internal state.
Warning
Modifying RethinkDB directly is not supported and may cause problems.
As of v2.5.5, the RethinkCLI has been removed from the
RethinkDB image along with other unused components. You can now run RethinkCLI
from a separate image in the dockerhubenterprise
organization. Note that
the commands below are using separate tags for non-interactive and interactive
modes.
Use SSH to log into a node that is running a DTR replica, and run the following:
# List problems in the cluster detected by the current node.
REPLICA_ID=$(docker container ls --filter=name=dtr-rethink --format '{{.Names}}' | cut -d'/' -f2 | cut -d'-' -f3 | head -n 1) && echo 'r.db("rethinkdb").table("current_issues")' | docker run --rm -i --net dtr-ol -v "dtr-ca-${REPLICA_ID}:/ca" -e DTR_REPLICA_ID=$REPLICA_ID dockerhubenterprise/rethinkcli:v2.2.0-ni non-interactive
On a healthy cluster the output will be []
.
Starting in DTR 2.5.5, you can run RethinkCLI from a separate image. First, set an environment variable for your DTR replica ID:
REPLICA_ID=$(docker inspect -f '{{.Name}}' $(docker ps -q -f name=dtr-rethink) | cut -f 3 -d '-')
RethinkDB stores data in different databases that contain multiple tables. Run the following command to get into interactive mode and query the contents of the DB:
docker run -it --rm --net dtr-ol -v dtr-ca-$REPLICA_ID:/ca dockerhubenterprise/rethinkcli:v2.3.0 $REPLICA_ID
# List problems in the cluster detected by the current node.
> r.db("rethinkdb").table("current_issues")
[]
# List all the DBs in RethinkDB
> r.dbList()
[ 'dtr2',
'jobrunner',
'notaryserver',
'notarysigner',
'rethinkdb' ]
# List the tables in the dtr2 db
> r.db('dtr2').tableList()
[ 'blob_links',
'blobs',
'client_tokens',
'content_caches',
'events',
'layer_vuln_overrides',
'manifests',
'metrics',
'namespace_team_access',
'poll_mirroring_policies',
'promotion_policies',
'properties',
'pruning_policies',
'push_mirroring_policies',
'repositories',
'repository_team_access',
'scanned_images',
'scanned_layers',
'tags',
'user_settings',
'webhooks' ]
# List the entries in the repositories table
> r.db('dtr2').table('repositories')
[ { enableManifestLists: false,
id: 'ac9614a8-36f4-4933-91fa-3ffed2bd259b',
immutableTags: false,
name: 'test-repo-1',
namespaceAccountID: 'fc3b4aec-74a3-4ba2-8e62-daed0d1f7481',
namespaceName: 'admin',
pk: '3a4a79476d76698255ab505fb77c043655c599d1f5b985f859958ab72a4099d6',
pulls: 0,
pushes: 0,
scanOnPush: false,
tagLimit: 0,
visibility: 'public' },
{ enableManifestLists: false,
id: '9f43f029-9683-459f-97d9-665ab3ac1fda',
immutableTags: false,
longDescription: '',
name: 'testing',
namespaceAccountID: 'fc3b4aec-74a3-4ba2-8e62-daed0d1f7481',
namespaceName: 'admin',
pk: '6dd09ac485749619becaff1c17702ada23568ebe0a40bb74a330d058a757e0be',
pulls: 0,
pushes: 0,
scanOnPush: false,
shortDescription: '',
tagLimit: 1,
visibility: 'public' } ]
Individual DBs and tables are a private implementation detail and may
change in DTR from version to version, but you can always use
dbList()
and tableList()
to explore the contents and data
structure.
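For example, here are a couple of read-only ReQL queries you might run from the interactive session above; this is a sketch and assumes the dtr2 tables shown earlier exist in your version:
# Count the repositories stored in the dtr2 database
> r.db('dtr2').table('repositories').count()
# Fetch a single document from the tags table to inspect its structure
> r.db('dtr2').table('tags').limit(1)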
To check on the overall status of your DTR cluster without interacting with RethinkCLI, run the following API request:
curl -u admin:$TOKEN -X GET "https://<dtr-url>/api/v0/meta/cluster_status" -H "accept: application/json"
{
"rethink_system_tables": {
"cluster_config": [
{
"heartbeat_timeout_secs": 10,
"id": "heartbeat"
}
],
"current_issues": [],
"db_config": [
{
"id": "339de11f-b0c2-4112-83ac-520cab68d89c",
"name": "notaryserver"
},
{
"id": "aa2e893f-a69a-463d-88c1-8102aafebebc",
"name": "dtr2"
},
{
"id": "bdf14a41-9c31-4526-8436-ab0fed00c2fd",
"name": "jobrunner"
},
{
"id": "f94f0e35-b7b1-4a2f-82be-1bdacca75039",
"name": "notarysigner"
}
],
"server_status": [
{
"id": "9c41fbc6-bcf2-4fad-8960-d117f2fdb06a",
"name": "dtr_rethinkdb_5eb9459a7832",
"network": {
"canonical_addresses": [
{
"host": "dtr-rethinkdb-5eb9459a7832.dtr-ol",
"port": 29015
}
],
"cluster_port": 29015,
"connected_to": {
"dtr_rethinkdb_56b65e8c1404": true
},
"hostname": "9e83e4fee173",
"http_admin_port": "<no http admin>",
"reql_port": 28015,
"time_connected": "2019-02-15T00:19:22.035Z"
},
}
...
]
}
}
When a DTR replica is unhealthy or down, the DTR web UI displays a warning:
Warning: The following replicas are unhealthy: 59e4e9b0a254; Reasons: Replica reported health too long ago: 2017-02-18T01:11:20Z; Replicas 000000000000, 563f02aba617 are still healthy.
To fix this, you should remove the unhealthy replica from the DTR cluster, and join a new one. Start by running:
docker run -it --rm \
docker/dtr:2.7.5 remove \
--ucp-insecure-tls
And then:
docker run -it --rm \
docker/dtr:2.7.5 join \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
Docker Trusted Registry is a clustered application. You can join multiple replicas for high availability.
For a DTR cluster to be healthy, a majority of its replicas (n/2 + 1) need to be healthy and be able to communicate with the other replicas. This is also known as maintaining quorum.
This means that there are three failure scenarios possible.
One or more replicas are unhealthy, but the overall majority (n/2 + 1) is still healthy and able to communicate with one another.
In this example the DTR cluster has five replicas but one of the nodes stopped working, and the other has problems with the DTR overlay network.
Even though these two replicas are unhealthy the DTR cluster has a majority of replicas still working, which means that the cluster is healthy.
In this case you should repair the unhealthy replicas, or remove them from the cluster and join new ones.
A majority of replicas are unhealthy, making the cluster lose quorum, but at least one replica is still healthy, or at least the data volumes for DTR are accessible from that replica.
In this example the DTR cluster is unhealthy but since one replica is still running it’s possible to repair the cluster without having to restore from a backup. This minimizes the amount of data loss.
This is a total disaster scenario where all DTR replicas were lost, causing the data volumes for all DTR replicas to get corrupted or lost.
In a disaster scenario like this, you’ll have to restore DTR from an existing backup. Restoring from a backup should be only used as a last resort, since doing an emergency repair might prevent some data loss.
When one or more DTR replicas are unhealthy but the overall majority (n/2 + 1) is healthy and able to communicate with one another, your DTR cluster is still functional and healthy.
Given that the DTR cluster is healthy, there’s no need to execute any disaster recovery procedures like restoring from a backup.
Instead, you should:
Since a DTR cluster requires a majority of replicas to be healthy at all times, the order of these operations is important. If you join more replicas before removing the ones that are unhealthy, your DTR cluster might become unhealthy.
To understand why you should remove unhealthy replicas before joining new ones, imagine you have a five-replica DTR deployment, and something goes wrong with the overlay network connecting the replicas, causing them to be separated into two groups.
Because the cluster originally had five replicas, it can work as long as three replicas are still healthy and able to communicate (5 / 2 + 1 = 3). Even though the network separated the replicas in two groups, DTR is still healthy.
If at this point you join a new replica instead of fixing the network problem or removing the two replicas that got isolated from the rest, it’s possible that the new replica ends up on the side of the network partition that has fewer replicas.
When this happens, both groups now have the minimum amount of replicas needed to establish a cluster. This is also known as a split-brain scenario, because both groups can now accept writes and their histories start diverging, making the two groups effectively two different clusters.
To remove unhealthy replicas, you’ll first have to find the replica ID of one of the replicas you want to keep, and the replica IDs of the unhealthy replicas you want to remove.
You can find the list of replicas by navigating to Shared Resources > Stacks or Swarm > Volumes (when using swarm mode) on the UCP web interface, or by using the UCP client bundle to run:
docker ps --format "{{.Names}}" | grep dtr
# The list of DTR containers with <node>/<component>-<replicaID>, e.g.
# node-1/dtr-api-a1640e1c15b6
Another way to determine the replica ID is to SSH into a DTR node and run the following:
REPLICA_ID=$(docker inspect -f '{{.Name}}' $(docker ps -q -f name=dtr-rethink) | cut -f 3 -d '-') && echo $REPLICA_ID
Then use the UCP client bundle to remove the unhealthy replicas:
docker run -it --rm docker/dtr:2.7.5 remove \
--existing-replica-id <healthy-replica-id> \
--replica-ids <unhealthy-replica-id> \
--ucp-insecure-tls \
--ucp-url <ucp-url> \
--ucp-username <user> \
--ucp-password <password>
You can remove more than one replica at the same time, by specifying multiple IDs with a comma.
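For example, here is a hedged sketch of removing two unhealthy replicas in a single command (all values are placeholders):
docker run -it --rm docker/dtr:2.7.5 remove \
  --existing-replica-id <healthy-replica-id> \
  --replica-ids <unhealthy-replica-id-1>,<unhealthy-replica-id-2> \
  --ucp-insecure-tls \
  --ucp-url <ucp-url> \
  --ucp-username <user> \
  --ucp-password <password>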
Once you’ve removed the unhealthy nodes from the cluster, you should join new ones to make sure your cluster is highly available.
Use your UCP client bundle to run the following command which prompts you for the necessary parameters:
docker run -it --rm \
docker/dtr:2.7.5 join \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
For a DTR cluster to be healthy, a majority of its replicas (n/2 + 1) need to be healthy and be able to communicate with the other replicas. This is known as maintaining quorum.
In a scenario where quorum is lost, but at least one replica is still accessible, you can use that replica to repair the cluster. That replica doesn’t need to be completely healthy. The cluster can still be repaired as the DTR data volumes are persisted and accessible.
Repairing the cluster from an existing replica minimizes the amount of data lost. If this procedure doesn’t work, you’ll have to restore from an existing backup.
When a majority of replicas are unhealthy, causing the overall DTR
cluster to become unhealthy, operations like docker login ,
docker pull , and docker push return an internal server error.
Accessing the /_ping
endpoint of any replica also returns the same
error. It’s also possible that the DTR web UI is partially or fully
unresponsive.
Use the docker/dtr emergency-repair
command to try to repair an
unhealthy DTR cluster, from an existing replica.
This command checks that the data volumes for the DTR replica are uncorrupted, redeploys all internal DTR components, and reconfigures them to use the existing volumes. It also reconfigures DTR by removing all other nodes from the cluster, leaving DTR as a single-replica cluster with the replica you chose.
Start by finding the ID of the DTR replica that you want to repair from. You can find the list of replicas by navigating to Shared Resources > Stacks or Swarm > Volumes (when using swarm mode) on the UCP web interface, or by using a UCP client bundle to run:
docker ps --format "{{.Names}}" | grep dtr
# The list of DTR containers with <node>/<component>-<replicaID>, e.g.
# node-1/dtr-api-a1640e1c15b6
Another way to determine the replica ID is to SSH into a DTR node and run the following:
REPLICA_ID=$(docker inspect -f '{{.Name}}' $(docker ps -q -f name=dtr-rethink) | cut -f 3 -d '-') && echo $REPLICA_ID
Then, use your UCP client bundle to run the emergency repair command:
docker run -it --rm docker/dtr:2.7.5 emergency-repair \
--ucp-insecure-tls \
--existing-replica-id <replica-id>
If the emergency repair procedure is successful, your DTR cluster now has a single replica. You should now join more replicas for high availability.
If the emergency repair command fails, try running it again using a different replica ID. As a last resort, you can restore your cluster from an existing backup.
Docker Trusted Registry maintains data about:
Data | Description |
---|---|
Configurations | The DTR cluster configurations |
Repository metadata | The metadata about the repositories and images deployed |
Access control to repos and images | Permissions for teams and repositories |
Notary data | Notary tags and signatures |
Scan results | Security scanning results for images |
Certificates and keys | The certificates, public keys, and private keys that are used for mutual TLS communication |
Images content | The images you push to DTR. This can be stored on the file system of the node running DTR, or other storage system, depending on the configuration |
This data is persisted on the host running DTR, using named volumes. Learn more about DTR named volumes.
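To see these volumes on a node running DTR, you can list Docker volumes and filter by name; this is a small sketch and assumes the dtr- volume name prefix used by the commands elsewhere in this guide:
# List the DTR named volumes on this node
docker volume ls --filter name=dtr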
To perform a backup of a DTR node, run the docker/dtr backup <dtr-cli-backup> command. This command backs up the following data:
Data | Backed up | Description |
---|---|---|
Configurations | yes | DTR settings |
Repository metadata | yes | Metadata such as image architecture and size |
Access control to repos and images | yes | Data about who has access to which images |
Notary data | yes | Signatures and digests for images that are signed |
Scan results | yes | Information about vulnerabilities in your images |
Certificates and keys | yes | TLS certificates and keys used by DTR |
Image content | no | Needs to be backed up separately, depends on DTR configuration |
Users, orgs, teams | no | Create a UCP backup to back up this data |
Vulnerability database | no | Can be redownloaded after a restore |
To create a backup of DTR, you need to:
You should always create backups from the same DTR replica to ensure a smoother restore. If you have not previously performed a backup, the web interface displays a warning prompting you to do so.
Since you need your DTR replica ID during a backup, the following covers a few ways for you to determine your replica ID:
You can find the list of replicas by navigating to Shared Resources > Stacks or Swarm > Volumes (when using swarm mode) on the UCP web interface.
From a terminal using a UCP client bundle, run:
docker ps --format "{{.Names}}" | grep dtr
# The list of DTR containers with <node>/<component>-<replicaID>, e.g.
# node-1/dtr-api-a1640e1c15b6
Another way to determine the replica ID is to log into a DTR node using SSH and run the following:
REPLICA_ID=$(docker ps --format '{{.Names}}' -f name=dtr-rethink | cut -f 3 -d '-') && echo $REPLICA_ID
Since you can configure the storage backend that DTR uses to store images, the way you back up images depends on the storage backend you’re using.
If you’ve configured DTR to store images on the local file system or NFS
mount, you can back up the images by using SSH to log into a DTR node,
and creating a tar
archive of the dtr-registry
volume.
sudo tar -cf dtr-image-backup-$(date +%Y%m%d-%H_%M_%S).tar \
/var/lib/docker/volumes/dtr-registry-$(docker ps --format '{{.Names}}' -f name=dtr-rethink | cut -f 3 -d '-')
Expected output
tar: Removing leading `/' from member names
If you’re using a different storage backend, follow the best practices recommended for that system.
To create a DTR backup, load your UCP client bundle, and run the following command.
DTR_VERSION=$(docker container inspect $(docker container ps -f name=dtr-registry -q) | \
grep -m1 -Po '(?<=DTR_VERSION=)\d.\d.\d'); \
REPLICA_ID=$(docker ps --format '{{.Names}}' -f name=dtr-rethink | cut -f 3 -d '-'); \
read -p 'ucp-url (The UCP URL including domain and port): ' UCP_URL; \
read -p 'ucp-username (The UCP administrator username): ' UCP_ADMIN; \
read -sp 'ucp password: ' UCP_PASSWORD; \
docker run --log-driver none -i --rm \
--env UCP_PASSWORD=$UCP_PASSWORD \
docker/dtr:$DTR_VERSION backup \
--ucp-username $UCP_ADMIN \
--ucp-url $UCP_URL \
--ucp-ca "$(curl https://${UCP_URL}/ca)" \
--existing-replica-id $REPLICA_ID > dtr-metadata-${DTR_VERSION}-backup-$(date +%Y%m%d-%H_%M_%S).tar
<ucp-url> is the URL you use to access UCP.
<ucp-username> is the username of a UCP administrator.
<replica-id> is the DTR replica ID to back up.
The above chained commands run through the following tasks:
1. Sets your DTR version and replica ID. To back up a specific replica, set the replica ID manually by modifying the --existing-replica-id flag in the backup command.
2. Prompts you for your UCP URL (domain and port) and admin username.
3. Prompts you for your UCP password without saving it to your disk or printing it on the terminal.
4. Retrieves the CA certificate for your specified UCP URL. To skip TLS verification, replace the --ucp-ca flag with --ucp-insecure-tls. Docker does not recommend this flag for production environments.
5. Includes the DTR version and a timestamp in your tar backup file.
You can learn more about the supported flags in the DTR backup reference documentation.
By default, the backup command does not pause the DTR replica being
backed up to prevent interruptions of user access to DTR. Since the
replica is not stopped, changes that happen during the backup may not be
saved. Use the --offline-backup
flag to stop the DTR replica during
the backup procedure. If you set this flag, remove the replica from the
load balancing pool to avoid user interruption.
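For reference, here is a hedged sketch of an offline backup that combines the flags described above (all values are placeholders; see the DTR backup reference for the full list of options):
docker run --log-driver none -i --rm \
  --env UCP_PASSWORD=<ucp-password> \
  docker/dtr:2.7.5 backup \
  --ucp-url <ucp-url> \
  --ucp-username <ucp-username> \
  --existing-replica-id <replica-id> \
  --offline-backup \
  --ucp-insecure-tls > dtr-metadata-backup.tar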
Also, the backup contains sensitive information like private keys, so you can encrypt the backup by running:
gpg --symmetric {{ metadata_backup_file }}
This prompts you for a password to encrypt the backup, copies the backup file and encrypts it.
To validate that the backup was correctly performed, you can print the contents of the tar file created. The backup of the images should look like:
tar -tf {{ metadata_backup_file }}
dtr-backup-v2.7.5/
dtr-backup-v2.7.5/rethink/
dtr-backup-v2.7.5/rethink/layers/
And the backup of the DTR metadata should look like:
tar -tf {{ metadata_backup_file }}
# The archive should look like this
dtr-backup-v2.7.5/
dtr-backup-v2.7.5/rethink/
dtr-backup-v2.7.5/rethink/properties/
dtr-backup-v2.7.5/rethink/properties/0
If you’ve encrypted the metadata backup, you can use:
gpg -d {{ metadata_backup_file }} | tar -t
You can also create a backup of a UCP cluster and restore it into a new cluster. Then restore DTR on that new cluster to confirm that everything is working as expected.
If your DTR has a majority of unhealthy replicas, the only way to restore it to a working state is to restore from an existing backup.
To restore DTR, you need to:
You need to restore DTR on the same UCP cluster where you’ve created the backup. If you restore on a different UCP cluster, all DTR resources will be owned by users that don’t exist, so you’ll not be able to manage the resources, even though they’re stored in the DTR data store.
When restoring, you need to use the same version of the docker/dtr
image that you used when creating the backup. Other versions are not
guaranteed to work.
Start by removing any DTR container that is still running:
docker run -it --rm \
docker/dtr:2.7.5 destroy \
--ucp-insecure-tls
If you had DTR configured to store images on the local filesystem, you can extract your backup:
sudo tar -xf {{ image_backup_file }} -C /var/lib/docker/volumes
If you’re using a different storage backend, follow the best practices recommended for that system.
You can restore the DTR metadata with the docker/dtr restore
command. This performs a fresh installation of DTR, and reconfigures it
with the configuration created during a backup.
Load your UCP client bundle, and run the following command, replacing the placeholders for the real values:
read -sp 'ucp password: ' UCP_PASSWORD;
This prompts you for the UCP password. Next, run the following to restore DTR from your backup. You can learn more about the supported flags in docker/dtr restore.
docker run -i --rm \
--env UCP_PASSWORD=$UCP_PASSWORD \
docker/dtr:2.7.5 restore \
--ucp-url <ucp-url> \
--ucp-insecure-tls \
--ucp-username <ucp-username> \
--ucp-node <hostname> \
--replica-id <replica-id> \
--dtr-external-url <dtr-external-url> < {{ metadata_backup_file }}
Where:
<ucp-url> is the URL you use to access UCP
<ucp-username> is the username of a UCP administrator
<hostname> is the hostname of the node where you’ve restored the images
<replica-id> is the ID of the replica you backed up
<dtr-external-url> is the URL that clients use to access DTR
If you’re using NFS as a storage backend, also include
--nfs-storage-url as part of your restore command; otherwise DTR is
restored but starts using a local volume to persist your Docker images.
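For example, here is a hedged sketch of the restore command with NFS; the NFS URL is a placeholder and should match your environment:
docker run -i --rm \
  --env UCP_PASSWORD=$UCP_PASSWORD \
  docker/dtr:2.7.5 restore \
  --ucp-url <ucp-url> \
  --ucp-insecure-tls \
  --ucp-username <ucp-username> \
  --ucp-node <hostname> \
  --replica-id <replica-id> \
  --dtr-external-url <dtr-external-url> \
  --nfs-storage-url <nfs-storage-url> < {{ metadata_backup_file }}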
Warning
When running DTR 2.5 (with experimental online garbage collection)
and 2.6.0 to 2.6.3, there is an issue with reconfiguring and
restoring DTR with --nfs-storage-url which leads to erased tags.
Make sure to back up your DTR metadata before you proceed. To work
around the --nfs-storage-url flag issue, manually create a storage
volume on each DTR node. To restore DTR from an existing backup, use
docker/dtr restore with --dtr-storage-volume and the new volume.
See Restore to a Local NFS Volume for Docker’s recommended recovery strategy.
If you’re scanning images, you now need to download the vulnerability database.
After you successfully restore DTR, you can join new replicas the same way you would after a fresh installation. Learn more.
By default Docker Engine uses TLS when pushing and pulling images to an image registry like Docker Trusted Registry.
If DTR is using the default configurations or was configured to use self-signed certificates, you need to configure your Docker Engine to trust DTR. Otherwise, when you try to log in, push to, or pull images from DTR, you’ll get an error:
docker login dtr.example.org
x509: certificate signed by unknown authority
The first step to make your Docker Engine trust the certificate authority used by DTR is to get the DTR CA certificate. Then you configure your operating system to trust that certificate.
In your browser navigate to https://<dtr-url>/ca
to download the TLS
certificate used by DTR. Then add that certificate to macOS
Keychain.
After adding the CA certificate to Keychain, restart Docker Desktop for Mac.
In your browser navigate to https://<dtr-url>/ca
to download the TLS
certificate used by DTR. Open Windows Explorer, right-click the file
you’ve downloaded, and choose Install certificate.
Then, select the following options:
Learn more about managing TLS certificates.
After adding the CA certificate to Windows, restart Docker Desktop for Windows.
# Download the DTR CA certificate
sudo curl -k https://<dtr-domain-name>/ca -o /usr/local/share/ca-certificates/<dtr-domain-name>.crt
# Refresh the list of certificates to trust
sudo update-ca-certificates
# Restart the Docker daemon
sudo service docker restart
# Download the DTR CA certificate
sudo curl -k https://<dtr-domain-name>/ca -o /etc/pki/ca-trust/source/anchors/<dtr-domain-name>.crt
# Refresh the list of certificates to trust
sudo update-ca-trust
# Restart the Docker daemon
sudo /bin/systemctl restart docker.service
Log into the virtual machine with ssh:
docker-machine ssh <machine-name>
Create the bootsync.sh
file, and make it executable:
sudo touch /var/lib/boot2docker/bootsync.sh
sudo chmod 755 /var/lib/boot2docker/bootsync.sh
Add the following content to the bootsync.sh
file. You can use
nano or vi for this.
#!/bin/sh
cat /var/lib/boot2docker/server.pem >> /etc/ssl/certs/ca-certificates.crt
Add the DTR CA certificate to the server.pem
file:
curl -k https://<dtr-domain-name>/ca | sudo tee -a /var/lib/boot2docker/server.pem
Run bootsync.sh
and restart the Docker daemon:
sudo /var/lib/boot2docker/bootsync.sh
sudo /etc/init.d/docker restart
To validate that your Docker daemon trusts DTR, try authenticating against DTR.
docker login dtr.example.org
Configure your Notary client as described in Delegations for content trust.
Docker Trusted Registry can be configured to have one or more caches. This allows you to choose from which cache to pull images from for faster download times.
If an administrator has set up caches, you can choose which cache to use when pulling images.
In the DTR web UI, navigate to your Account, and check the Content Cache options.
Once you save, your images are pulled from the cache instead of the central DTR.
Since DTR is secure by default, you need to create the image repository before being able to push the image to DTR.
In this example, we’ll create the wordpress
repository in DTR.
To create an image repository for the first time, log in to
https://<dtr-url>
with your UCP credentials.
Select Repositories from the left navigation pane and click New repository on the upper right corner of the Repositories page.
Select your namespace and enter a name for your repository. You can optionally add a description.
Choose whether your repository is public
or private
:
Click Create to create the repository.
When creating a repository in DTR, the full name of the repository
becomes <dtr-domain-name>/<user-or-org>/<repository-name>
. In
this example, the full name of our repository will be
dtr-example.com/test-user-1/wordpress
.
Optional: Click Show advanced settings to make your tags immutable or set your image scanning trigger.
Note
Starting in DTR 2.6, repository admins can enable tag pruning by setting a tag limit. This can only be set if you turn off Immutability and allow your repository tags to be overwritten.
Image name size for DTR
When creating an image name for use with DTR, ensure that the combined organization and repository name contains fewer than 56 characters, and that the entire image name, which includes the domain, organization, and repository name, does not exceed 255 characters.
The 56-character <user-or-org/repository-name> limit in DTR is due to an underlying limitation in how the image name information is stored within DTR metadata in RethinkDB. RethinkDB currently has a Primary Key length limit of 127 characters.
When DTR stores the above data it appends a sha256sum comprised of 72 characters to the end of the value to ensure uniqueness within the database. If the <user-or-org/repository-name> exceeds 56 characters it will then exceed the 127 character limit in RethinkDB (72+56=128).
Multi-architecture images
While you can enable just-in-time creation of multi-architecture image repositories when creating a repository via the API, Docker does not recommend using this option. This breaks content trust and causes other issues. To manage Docker image manifests and manifest lists, use the experimental CLI command, docker manifest, instead.
The Repository Info tab includes the following details:
To learn more about pulling images, see Pull and push images. To review your repository permissions, do the following:
Navigate to https://<dtr-url>
and log in with your UCP
credentials.
Select Repositories on the left navigation pane, and then click
on the name of the repository that you want to view. Note that you
will have to click on the repository name following the /
after
the specific namespace for your repository.
You should see the Info tab by default. Notice Your Permission under Docker Pull Command.
Hover over the question mark next to your permission level to view the list of repository events you have access to.
Limitations
Your permissions list may include repository events that are not displayed in the Activity tab. It is also not an exhaustive list of event types displayed on your activity stream. To learn more about repository events, see Audit Repository Events.
You interact with Docker Trusted Registry in the same way you interact with Docker Hub or any other registry:
docker login <dtr-url> : authenticates you on DTR
docker pull <image>:<tag> : pulls an image from DTR
docker push <image>:<tag> : pushes an image to DTR
Pulling an image from Docker Trusted Registry is the same as pulling an image from Docker Hub or any other registry. Since DTR is secure by default, you always need to authenticate before pulling images.
In this example, DTR can be accessed at dtr-example.com
, and the user was
granted permissions to access the nginx
and wordpress
repositories in the library
organization.
Click on the repository name to see its details.
To pull the latest tag of the library/wordpress
image, run:
docker login dtr-example.com
docker pull dtr-example.com/library/wordpress:latest
Before you can push an image to DTR, you need to create a
repository to store the image. In this example the full
name of our repository is dtr-example.com/library/wordpress
.
In this example we’ll pull the wordpress image from Docker Hub and tag with the full DTR and repository name. A tag defines where the image was pulled from, and where it will be pushed to.
# Pull from Docker Hub the latest tag of the wordpress image
docker pull wordpress:latest
# Tag the wordpress:latest image with the full repository name we've created in DTR
docker tag wordpress:latest dtr-example.com/library/wordpress:latest
Now that you have tagged the image, you only need to authenticate and push the image to DTR.
docker login dtr-example.com
docker push dtr-example.com/library/wordpress:latest
On the web interface, navigate to the Tags tab on the repository page to confirm that the tag was successfully pushed.
The base layers of the Microsoft Windows base images have restrictions on how they can be redistributed. When you push a Windows image to DTR, Docker only pushes the image manifest and all the layers on top of the Windows base layers. The Windows base layers are not pushed to DTR. This means that:
This default behavior is recommended for standard Docker EE installations, but for air-gapped or similarly limited setups, Docker can optionally also push the Windows base layers to DTR.
To configure Docker to always push Windows layers to DTR, add the
following to your C:\ProgramData\docker\config\daemon.json
configuration file:
"allow-nondistributable-artifacts": ["<dtr-domain>:<dtr-port>"]
To delete an image, navigate to the Tags tab of the repository page on the DTR web interface. In the Tags tab, select all the image tags you want to delete, and click the Delete button.
You can also delete all image versions by deleting the repository. To delete a repository, navigate to Settings and click Delete under Delete Repository.
DTR only allows deleting images if the image has not been signed. You first need to delete all the trust data associated with the image before you are able to delete the image.
There are three steps to delete a signed image:
To find which roles signed an image, you first need to learn which roles are trusted to sign the image.
Configure your Notary client and run:
notary delegation list dtr-example.com/library/wordpress
In this example, the repository owner delegated trust to the
targets/releases
and targets/qa
roles:
ROLE PATHS KEY IDS THRESHOLD
---- ----- ------- ---------
targets/releases "" <all paths> c3470c45cefde5...2ea9bc8 1
targets/qa "" <all paths> c3470c45cefde5...2ea9bc8 1
Now that you know which roles are allowed to sign images in this repository, you can learn which roles actually signed it:
# Check if the image was signed by the "targets" role
notary list dtr-example.com/library/wordpress
# Check if the image was signed by a specific role
notary list dtr-example.com/library/wordpress --roles <role-name>
In this example the image was signed by three roles: targets
,
targets/releases
, and targets/qa
.
Once you know which roles signed an image, you’ll be able to remove trust data for those roles. Only users with private keys that have the roles are able to do this operation.
For each role that signed the image, run:
notary remove dtr-example.com/library/wordpress <tag> \
--roles <role-name> --publish
Once you’ve removed trust data for all roles, DTR shows the image as unsigned. Then you can delete it.
Docker Trusted Registry can scan images in your repositories to verify that they are free from known security vulnerabilities or exposures, using Docker Security Scanning. The results of these scans are reported for each image tag in a repository.
Docker Security Scanning is available as an add-on to Docker Trusted Registry, and an administrator configures it for your DTR instance. If you do not see security scan results available on your repositories, your organization may not have purchased the Security Scanning feature or it may be disabled. See Set up Security Scanning in DTR for more details.
Note
Only users with write access to a repository can manually start a scan. Users with read-only access can view the scan results, but cannot start a new scan.
Scans run either on demand when you click the Start a Scan link or
Scan button (see the section on manual scanning below), or
automatically on any docker push
to the repository.
First the scanner performs a binary scan on each layer of the image, identifies the software components in each layer, and indexes the SHA of each component in a bill-of-materials. A binary scan evaluates the components on a bit-by-bit level, so vulnerable components are discovered even if they are statically linked or under a different name.
The scan then compares the SHA of each component against the US National Vulnerability Database that is installed on your DTR instance. When this database is updated, DTR reviews the indexed components for newly discovered vulnerabilities.
DTR scans both Linux and Windows images, but by default Docker doesn’t push foreign image layers for Windows images, so DTR won’t be able to scan them. If you want DTR to scan your Windows images, configure Docker to always push image layers (see Pull and push images), and it will scan the non-foreign layers.
By default, Docker Security Scanning runs automatically on
docker push
to an image repository.
If your DTR instance is configured in this way, you do not need to do
anything once your docker push
completes. The scan runs
automatically, and the results are reported in the repository’s Tags
tab after the scan finishes.
If your repository owner enabled Docker Security Scanning but disabled
automatic scanning, you can manually start a scan for images in
repositories you have write
access to.
To start a security scan, navigate to the repository Tags tab on the web interface, click “View details” next to the relevant tag, and click Scan.
DTR begins the scanning process. You will need to refresh the page to see the results once the scan is complete.
You can change the scanning mode for each individual repository at any time. You might want to disable scanning if you are pushing an image repeatedly during troubleshooting and don’t want to waste resources scanning and re-scanning, or if a repository contains legacy code that is not used or updated frequently.
Note
To change an individual repository’s scanning mode, you
must have write
or administrator
access to the repo.
To change the repository scanning mode:
Once DTR has run a security scan for an image, you can view the results.
The Tags tab for each repository includes a summary of the most recent scan results for each image.
The text Clean in green indicates that the scan did not find any vulnerabilities.
A red or orange text indicates that vulnerabilities were found, and the number of vulnerabilities is included on that same line according to severity: Critical, Major, Minor.
If the vulnerability scan could not detect the version of a component, it reports the vulnerabilities for all versions of that component.
From the repository Tags tab, you can click View details for a specific tag to see the full scan results. The top of the page also includes metadata about the image, including the SHA, image size, last push date, user who initiated the push, the security scan summary, and the security scan progress.
The scan results for each image include two different modes so you can quickly view details about the image, its components, and any vulnerabilities found.
The Layers view lists the layers of the image in the order that they are built by Dockerfile.
This view can help you find exactly which command in the build introduced the vulnerabilities, and which components are associated with that single command. Click a layer to see a summary of its components. You can then click on a component to switch to the Component view and get more details about the specific item.
Note
The layers view can be long, so be sure to scroll down if you don’t immediately see the reported vulnerabilities.
The Components view lists the individual component libraries indexed by the scanning system, in order of severity and number of vulnerabilities found, with the most vulnerable library listed first.
Click on an individual component to view details about the vulnerability it introduces, including a short summary and a link to the official CVE database report. A single component can have multiple vulnerabilities, and the scan report provides details on each one. The component details also include the license type used by the component, and the filepath to the component in the image.
If you find that an image in your registry contains vulnerable components, you can use the linked CVE scan information in each scan report to evaluate the vulnerability and decide what to do.
If you discover vulnerable components, you should check if there is an updated version available where the security vulnerability has been addressed. If necessary, you can contact the component’s maintainers to ensure that the vulnerability is being addressed in a future version or a patch update.
If the vulnerability is in a base layer
(such as an operating
system) you might not be able to correct the issue in the image. In this
case, you can switch to a different version of the base layer, or you
can find an equivalent, less vulnerable base layer.
Address vulnerabilities in your repositories by updating the images to use updated and corrected versions of vulnerable components, or by using a different component offering the same functionality. When you have updated the source code, run a build to create a new image, tag the image, and push the updated image to your DTR instance. You can then re-scan the image to confirm that you have addressed the vulnerabilities.
DTR scans your images for vulnerabilities but sometimes it can report that your image has vulnerabilities you know have been fixed. If that happens you can dismiss the warning.
In the DTR web interface, navigate to the repository that has been scanned.
Click View details to review the image scan results, and choose Components to see the vulnerabilities for each component packaged in the image.
Select the component with the vulnerability you want to ignore, navigate to the vulnerability, and click hide.
The vulnerability is hidden system-wide and will no longer be reported as a vulnerability on affected images with the same layer IDs or digests.
After dismissing a vulnerability, DTR will not reevaluate the promotion policies you have set up for the repository.
If you want the promotion policy to be reevaluated for the image after hiding a particular vulnerability, click Promote.
By default, users with read and write access to a repository can push the same tag
multiple times to that repository. For example, when user A pushes an image
to library/wordpress:latest
, there is nothing preventing user B from pushing
an image with the same name but a completely different functionality. This can
make it difficult to trace the image back to the build that generated it.
To prevent tags from being overwritten, you can configure a repository to be immutable. Once configured, DTR will not allow anyone else to push another image tag with the same name.
You can enable tag immutability on a repository when you create it, or at any time after.
If you’re not already logged in, navigate to https://<dtr-url>
and
log in with your UCP credentials. To make tags immutable on a new
repository, do the following:
Select Repositories on the left navigation pane, and then click
on the name of the repository that you want to view. Note that you
will have to click on the repository name following the /
after
the specific namespace for your repository.
From now on, you will get an error message when trying to push a tag that already exists:
docker push dtr-example.com/library/wordpress:latest
unknown: tag=latest cannot be overwritten because
dtr-example.com/library/wordpress is an immutable repository
Two key components of the Docker Trusted Registry are the Notary Server and the Notary Signer. These two containers provide the required components for using Docker Content Trust (DCT) out of the box. Docker Content Trust allows you to sign image tags, therefore giving consumers a way to verify the integrity of your image.
As part of DTR, both the Notary and the Registry servers are accessed through a front-end proxy, with both components sharing the UCP’s RBAC (Role-based Access Control) Engine. Therefore, you do not need additional Docker client configuration in order to use DCT.
DCT is integrated with the Docker CLI, and allows you to:
UCP has a feature which will prevent untrusted
images from being deployed on the cluster. To
use the feature, you need to sign and push images to your DTR. To tie the
signed images back to UCP, you need to sign the images with the private keys of
the UCP users. From a UCP client bundle, use key.pem
as your private key,
and cert.pem
as your public key on an x509
certificate.
To sign images in a way that UCP can trust, you need to:
The following example shows the nginx
image getting pulled from
Docker Hub, tagged as dtr.example.com/dev/nginx:1
, pushed to DTR,
and signed in a way that is trusted by UCP.
After downloading and extracting a UCP client bundle into your local
directory, you need to load the private key into the local Docker trust
store (~/.docker/trust)
. To illustrate the process, we will use
jeff
as an example user.
$ docker trust key load --name jeff key.pem
Loading key from "key.pem"...
Enter passphrase for new jeff key with ID a453196:
Repeat passphrase for new jeff key with ID a453196:
Successfully imported key from key.pem
Next, initiate trust metadata for a DTR repository. If you have not
already done so, navigate to the DTR web UI, and create a repository
for your image. This example uses the nginx
repository in the
prod
namespace.
As part of initiating the repository, the public key of the UCP user needs to be added to the Notary server as a signer for the repository. You will be asked for a number of passphrases to protect the keys. Make a note of these passphrases.
$ docker trust signer add --key cert.pem jeff dtr.example.com/prod/nginx
Adding signer "jeff" to dtr.example.com/prod/nginx...
Initializing signed repository for dtr.example.com/prod/nginx...
Enter passphrase for root key with ID 4a72d81:
Enter passphrase for new repository key with ID e0d15a2:
Repeat passphrase for new repository key with ID e0d15a2:
Successfully initialized "dtr.example.com/prod/nginx"
Successfully added signer: jeff to dtr.example.com/prod/nginx
Inspect the trust metadata of the repository to make sure the user has been added correctly.
$ docker trust inspect --pretty dtr.example.com/prod/nginx
No signatures for dtr.example.com/prod/nginx
List of signers and their keys for dtr.example.com/prod/nginx
SIGNER KEYS
jeff 927f30366699
Administrative keys for dtr.example.com/prod/nginx
Repository Key: e0d15a24b7...540b4a2506b
Root Key: b74854cb27...a72fbdd7b9a
Finally, user jeff
can sign an image tag. The following steps
include downloading the image from Hub, tagging the image for Jeff’s DTR
repository, pushing the image to Jeff’s DTR, as well as signing the tag
with Jeff’s keys.
$ docker pull nginx:latest
$ docker tag nginx:latest dtr.example.com/prod/nginx:1
$ docker trust sign dtr.example.com/prod/nginx:1
Signing and pushing trust data for local image dtr.example.com/prod/nginx:1, may overwrite remote trust data
The push refers to repository [dtr.example.com/prod/nginx]
6b5e2ed60418: Pushed
92c15149e23b: Pushed
0a07e81f5da3: Pushed
1: digest: sha256:5b49c8e2c890fbb0a35f6050ed3c5109c5bb47b9e774264f4f3aa85bb69e2033 size: 948
Signing and pushing trust metadata
Enter passphrase for jeff key with ID 927f303:
Successfully signed dtr.example.com/prod/nginx:1
Inspect the trust metadata again to make sure the image tag has been signed successfully.
$ docker trust inspect --pretty dtr.example.com/prod/nginx:1
Signatures for dtr.example.com/prod/nginx:1
SIGNED TAG DIGEST SIGNERS
1 5b49c8e2c8...90fbb2033 jeff
List of signers and their keys for dtr.example.com/prod/nginx:1
SIGNER KEYS
jeff 927f30366699
Administrative keys for dtr.example.com/prod/nginx:1
Repository Key: e0d15a24b74...96540b4a2506b
Root Key: b74854cb27c...1ea72fbdd7b9a
Alternatively, you can review the signed image from the DTR web UI.
You have the option to sign an image using the keys of multiple UCP users. For
example, an image may need to be signed by both a member of the Security
team and a member of the Developers team. Let's assume jeff is a
member of the Developers team. In this case, we only need to add a
signer from the Security team.
To do so, first add the private key of the Security team member to the local Docker trust store.
$ docker trust key load --name ian key.pem
Loading key from "key.pem"...
Enter passphrase for new ian key with ID 5ac7d9a:
Repeat passphrase for new ian key with ID 5ac7d9a:
Successfully imported key from key.pem
Upload the user’s public key to the Notary Server and sign the image.
You will be asked for the passphrase of jeff, the developer, as well as
the passphrase of the ian user in order to sign the tag.
$ docker trust signer add --key cert.pem ian dtr.example.com/prod/nginx
Adding signer "ian" to dtr.example.com/prod/nginx...
Enter passphrase for repository key with ID e0d15a2:
Successfully added signer: ian to dtr.example.com/prod/nginx
$ docker trust sign dtr.example.com/prod/nginx:1
Signing and pushing trust metadata for dtr.example.com/prod/nginx:1
Existing signatures for tag 1 digest 5b49c8e2c890fbb0a35f6050ed3c5109c5bb47b9e774264f4f3aa85bb69e2033 from:
jeff
Enter passphrase for jeff key with ID 927f303:
Enter passphrase for ian key with ID 5ac7d9a:
Successfully signed dtr.example.com/prod/nginx:1
Finally, check the tag again to make sure it includes two signers.
$ docker trust inspect --pretty dtr.example.com/prod/nginx:1
Signatures for dtr.example.com/prod/nginx:1
SIGNED TAG DIGEST SIGNERS
1 5b49c8e2c89...5bb69e2033 jeff, ian
List of signers and their keys for dtr.example.com/prod/nginx:1
SIGNER KEYS
jeff 927f30366699
ian 5ac7d9af7222
Administrative keys for dtr.example.com/prod/nginx:1
Repository Key: e0d15a24b741ab049470298734397afbea539400510cb30d3b996540b4a2506b
Root Key: b74854cb27cc25220ede4b08028967d1c6e297a759a6939dfef1ea72fbdd7b9a
If an administrator wants to delete a DTR repository that contains trust metadata, they will be prompted to delete the trust metadata first before removing the repository.
To delete trust metadata, you need to use the Notary CLI.
$ notary delete dtr.example.com/prod/nginx --remote
Deleting trust data for repository dtr.example.com/prod/nginx
Enter username: admin
Enter password:
Successfully deleted local and remote trust data for repository dtr.example.com/prod/nginx
If you don’t include the --remote
flag, Notary deletes local cached
content but will not delete data from the Notary server.
For more advanced deployments, you may want to share one Docker Trusted Registry across multiple Universal Control Planes. However, customers wanting to adopt this model alongside the Only Run Signed Images UCP feature run into problems, as each UCP operates an independent set of users.
Docker Content Trust (DCT) gets around this problem, since users from a remote UCP are able to sign images in the central DTR and still apply runtime enforcement.
In the following example, we will connect DTR managed by UCP cluster 1 with a remote UCP cluster which we are calling UCP cluster 2, sign the image with a user from UCP cluster 2, and provide runtime enforcement within UCP cluster 2. This process could be repeated over and over, integrating DTR with multiple remote UCP clusters, signing the image with users from each environment, and then providing runtime enforcement in each remote UCP cluster separately.
Note
Before attempting this guide, familiarize yourself with Docker Content Trust and Only Run Signed Images on a single UCP. Many of the concepts within this guide may be new without that background.
Make sure the nodes in cluster 2 can reach the DTR URL, for example with curl https://dtr.example.com. As there is no registry running within cluster 2, by default UCP will not know where to check for trust data. Therefore, the first thing we need to do is register DTR within the remote UCP in cluster 2. When you normally install DTR, this registration process happens by default to a local UCP, or cluster 1.
Note
The registration process allows the remote UCP to get signature data from DTR, however this will not provide Single Sign On (SSO). Users on cluster 2 will not be synced with cluster 1’s UCP or DTR. Therefore when pulling images, registry authentication will still need to be passed as part of the service definition if the repository is private. See the Kubernetes example.
To add a new registry, retrieve the Certificate Authority (CA) used to
sign the DTR TLS Certificate through the DTR URL’s /ca
endpoint.
$ curl -ks https://dtr.example.com/ca > dtr.crt
Next, convert the DTR certificate into a JSON configuration file for registration within the UCP for cluster 2.
You can find a template of the dtr-bundle.json
below. Replace the
host address with your DTR URL, and enter the contents of the DTR CA
certificate between the \n newline escape sequences.
Note
JSON Formatting
Ensure there are no line breaks between each line of the DTR CA certificate within the JSON file. Use your favorite JSON formatter for validation.
$ cat dtr-bundle.json
{
"hostAddress": "dtr.example.com",
"caBundle": "-----BEGIN CERTIFICATE-----\n<contents of cert>\n-----END CERTIFICATE-----"
}
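If you have jq available, you can generate dtr-bundle.json without escaping the certificate by hand; this is a sketch that assumes the CA bundle was saved as dtr.crt in the previous step:
$ jq -n --arg host "dtr.example.com" --arg ca "$(cat dtr.crt)" \
    '{hostAddress: $host, caBundle: $ca}' > dtr-bundle.json
jq encodes the newlines in the certificate as \n automatically, producing the same structure as the template above.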
Now upload the configuration file to cluster 2’s UCP through the UCP API
endpoint, /api/config/trustedregistry_
. To authenticate against the
API of cluster 2’s UCP, we have downloaded a UCP client
bundle,
extracted it in the current directory, and will reference the keys for
authentication.
$ curl --cacert ca.pem --cert cert.pem --key key.pem \
-X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-d @dtr-bundle.json \
https://cluster2.example.com/api/config/trustedregistry_
Navigate to the UCP web interface to verify that the JSON file was imported successfully, as the UCP endpoint will not output anything. Select Admin > Admin Settings > Docker Trusted Registry. If the registry has been added successfully, you should see the DTR listed.
Additionally, you can check the full configuration
file within cluster 2’s UCP. Once downloaded, the
ucp-config.toml
file should now contain a section called [registries]:
$ curl --cacert ca.pem --cert cert.pem --key key.pem https://cluster2.example.com/api/ucp/config-toml > ucp-config.toml
If the new registry isn’t shown in the list, check the
ucp-controller
container logs on cluster 2.
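For example, on a manager node in cluster 2 you can check the most recent controller log lines with the standard Docker CLI:
$ docker logs --tail 100 ucp-controller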
We will now sign an image and push it to DTR. To sign images we need a
user's public/private key pair from cluster 2. It can be found in a
client bundle, with key.pem being the private key and cert.pem
being the public key on an X.509 certificate.
First, load the private key into the local Docker trust store
(~/.docker/trust)
. The name used here is purely metadata to help
keep track of which keys you have imported.
$ docker trust key load --name cluster2admin key.pem
Loading key from "key.pem"...
Enter passphrase for new cluster2admin key with ID a453196:
Repeat passphrase for new cluster2admin key with ID a453196:
Successfully imported key from key.pem
Next, initiate the repository and add the public key of cluster 2's user as a signer. You will be asked for a number of passphrases to protect the keys. Make a note of these passphrases, and see the [Docker Content Trust documentation](/engine/security/trust/trust_delegation/#managing-delegations-in-a-notary-server) to learn more about managing keys.
$ docker trust signer add --key cert.pem cluster2admin dtr.example.com/admin/trustdemo
Adding signer "cluster2admin" to dtr.example.com/admin/trustdemo...
Initializing signed repository for dtr.example.com/admin/trustdemo...
Enter passphrase for root key with ID 4a72d81:
Enter passphrase for new repository key with ID dd4460f:
Repeat passphrase for new repository key with ID dd4460f:
Successfully initialized "dtr.example.com/admin/trustdemo"
Successfully added signer: cluster2admin to dtr.example.com/admin/trustdemo
Finally, sign the image tag. This pushes the image up to DTR, as well as signs the tag with the user from cluster 2’s keys.
$ docker trust sign dtr.example.com/admin/trustdemo:1
Signing and pushing trust data for local image dtr.example.com/admin/trustdemo:1, may overwrite remote trust data
The push refers to repository [dtr.example.com/admin/trustdemo]
27c0b07c1b33: Layer already exists
aa84c03b5202: Layer already exists
5f6acae4a5eb: Layer already exists
df64d3292fd6: Layer already exists
1: digest: sha256:37062e8984d3b8fde253eba1832bfb4367c51d9f05da8e581bd1296fc3fbf65f size: 1153
Signing and pushing trust metadata
Enter passphrase for cluster2admin key with ID a453196:
Successfully signed dtr.example.com/admin/trustdemo:1
Within the DTR web interface, you should now be able to see your newly pushed tag with the Signed text next to the size.
You could sign this image multiple times if required, whether it’s multiple teams from the same cluster wanting to sign the image, or you integrating DTR with more remote UCPs so users from clusters 1, 2, 3, or more can all sign the same image.
We can now enable Only Run Signed Images on the remote UCP. To do this, log in to cluster 2's UCP web interface as an admin. Select Admin > Admin Settings > Docker Content Trust.
See Run only the images you trust for more information on only running signed images in UCP.
Finally, we can now deploy a workload on cluster 2 using a signed image
from the DTR running on cluster 1. This workload could be a simple
docker run, a Swarm service, or a Kubernetes workload. As a simple
test, source a client bundle and try running one of your signed images.
$ source env.sh
$ docker service create dtr.example.com/admin/trustdemo:1
nqsph0n6lv9uzod4lapx0gwok
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
nqsph0n6lv9u laughing_lamarr replicated 1/1 dtr.example.com/admin/trustdemo:1
If the image is stored in a private repository within DTR, you need to pass credentials to the Orchestrator as there is no SSO between cluster 2 and DTR. See the relevant Kubernetes documentation for more details.
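For example, with Kubernetes you can create an image pull secret in cluster 2 and reference it from the workload specification. The secret and pod names below (dtr-credentials, trustdemo) are only illustrative:
$ kubectl create secret docker-registry dtr-credentials \
    --docker-server=dtr.example.com \
    --docker-username=<username> \
    --docker-password=<password-or-access-token>
Then reference the secret in the pod specification:
apiVersion: v1
kind: Pod
metadata:
  name: trustdemo
spec:
  imagePullSecrets:
  - name: dtr-credentials
  containers:
  - name: trustdemo
    image: dtr.example.com/admin/trustdemo:1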
image or trust data does not exist for dtr.example.com/admin/trustdemo:1
This means something went wrong when initiating the repository or signing the image, as the tag contains no signing data.
Error response from daemon: image did not meet required signing policy
dtr.example.com/admin/trustdemo:1: image did not meet required signing policy
This means that the image was signed correctly, however the user who signed the image does not meet the signing policy in cluster 2. This could be because you signed the image with the wrong user keys.
Error response from daemon: dtr.example.com must be a registered trusted registry. See 'docker run --help'.
This means you have not registered DTR to work with a remote UCP instance yet, as outlined in Registering DTR with a remote Universal Control Plane.
Docker Trusted Registry allows you to automatically promote and mirror images based on a policy. This way you can create a Docker-centric development pipeline.
You can mix and match promotion policies, mirroring policies, and webhooks to create flexible development pipelines that integrate with your existing CI/CD systems.
Promote an image using policies
One way to create a promotion pipeline is to automatically promote images to another repository.
You start by defining a promotion policy that’s specific to a repository. When someone pushes an image to that repository, DTR checks if it complies with the policy you set up and automatically pushes the image to another repository.
Learn how to promote an image using policies.
Mirror images to another registry
You can also promote images between different DTR deployments. This not only allows you to create promotion policies that span multiple DTRs, but also allows you to mirror images for security and high availability.
You start by configuring a repository with a mirroring policy. When someone pushes an image to that repository, DTR checks if the policy is met, and if so pushes it to another DTR deployment or Docker Hub.
Learn how to mirror images to another registry.
Mirror images from another registry
Another option is to mirror images from another DTR deployment. You configure a repository to poll for changes in a remote repository. All new images pushed into the remote repository are then pulled into DTR.
This is an easy way to configure a mirror for high availability since you won’t need to change firewall rules that are in place for your environments.
Docker Trusted Registry allows you to create image promotion pipelines based on policies.
In this example we will create an image promotion pipeline such that:
- The development team pushes its images to the dev/website repository.
- Images that are ready for testing are tagged with a word that ends in -stable.
- When an image with such a tag is pushed to the dev/website repository, it will automatically be promoted to qa/website so that the QA team can start testing.
With this promotion policy, the development team doesn’t need access to the QA repositories, and the QA team doesn’t need access to the development repositories.
Once you have created a repository, navigate to the repository page on the DTR web interface, and select the Promotions tab.
Note
Only administrators can globally create and edit promotion policies. By default users can only create and edit promotion policies on repositories within their user namespace. For more information on user permissions, see Authentication and Authorization.
Click New promotion policy, and define the image promotion criteria.
DTR allows you to set your promotion policy based on the following image attributes:
Name | Description | Example |
---|---|---|
Tag name | Whether the tag name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Promote to Target if Tag name ends in stable |
Component | Whether the image has a given component and the component name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Promote to Target if Component name starts with b |
Vulnerabilities | Whether the image has vulnerabilities – critical, major, minor, or all – and your selected vulnerability filter is greater than or equals, greater than, equals, not equals, less than or equals, or less than your specified number | Promote to Target if Critical vulnerabilities = 3 |
License | Whether the image uses an intellectual property license and is one of or not one of your specified words | Promote to Target if License name = docker |
Now you need to choose what happens to an image that meets all the criteria.
Select the target organization or namespace and repository where the image is going to be pushed. You can choose to keep the image tag, or transform the tag into something more meaningful in the destination repository, by using a tag template.
In this example, if an image in the dev/website
is tagged with a
word that ends in “stable”, DTR will automatically push that image to
the qa/website
repository. In the destination repository the image
will be tagged with the timestamp of when the image was promoted.
Everything is set up! Once the development team pushes an image that
complies with the policy, it automatically gets promoted. To confirm,
select the Promotions tab on the dev/website
repository.
You can also review the newly pushed tag in the target repository by
navigating to qa/website
and selecting the Tags tab.
Docker Trusted Registry allows you to create mirroring policies for a repository. When an image gets pushed to a repository and meets the mirroring criteria, DTR automatically pushes it to a repository in a remote Docker Trusted or Hub registry.
This not only allows you to mirror images but also allows you to create image promotion pipelines that span multiple DTR deployments and datacenters.
In this example we will create an image mirroring policy such that:
- The development team pushes its images to dtr-example.com/dev/website, the repository in the DTR deployment dedicated to development.
- Images that are ready for testing are tagged with a word that ends in -stable.
- When an image with such a tag is pushed to dtr-example.com/dev/website, it will automatically be pushed to qa-example.com/qa/website, mirroring the image and promoting it to the next stage of development.
With this mirroring policy, the development team does not need access to the QA cluster, and the QA team does not need access to the development cluster.
You need to have permissions to push to the destination repository in order to set up the mirroring policy.
Once you have created a repository, navigate to the repository page on the web interface, and select the Mirrors tab.
Click New mirror, and define where the image will be pushed if it meets the mirroring criteria. Make sure the account you use for the integration has permissions to write to the remote repository. Under Mirror direction, choose Push to remote registry.
In this example, the image gets pushed to the qa/website
repository of a
DTR deployment available at qa-example.com
using a service account
that was created just for mirroring images between repositories. Note that you
may use a password or access token to log in to your remote registry.
If the destination DTR deployment is using self-signed TLS certificates or certificates issued by your own certificate authority, click Show advanced settings to provide the CA certificate used by the DTR where the image will be pushed.
You can get that CA certificate by accessing https://<destination-dtr>/ca.
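For example, mirroring the approach used earlier for the source DTR, you can download and save it with curl (the output file name is arbitrary):
$ curl -ks https://qa-example.com/ca > qa-dtr-ca.crt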
Once you’re done, click Connect to test the integration.
DTR allows you to set your mirroring policy based on the following image attributes:
Name | Description | Example |
---|---|---|
Tag name | Whether the tag name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Copy image to remote repository if Tag name ends in stable |
Component | Whether the image has a given component and the component name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Copy image to remote repository if Component name starts with b |
Vulnerabilities | Whether the image has vulnerabilities – critical, major, minor, or all – and your selected vulnerability filter is greater than or equals, greater than, equals, not equals, less than or equals, or less than your specified number | Copy image to remote repository if Critical vulnerabilities = 3 |
License | Whether the image uses an intellectual property license and is one of or not one of your specified words | Copy image to remote repository if License name = docker |
Finally, you can choose to keep the image tag, or transform the tag into something more meaningful in the destination repository by using a tag template.
In this example, if an image in the dev/website
repository is tagged
with a word that ends in stable
, DTR will automatically push that
image to the DTR deployment available at qa-example.com
. The image
is pushed to the qa/website
repository and is tagged with the
timestamp of when the image was promoted.
Everything is set up! Once the development team pushes an image that
complies with the policy, it automatically gets promoted to
qa/website
in the remote trusted registry at qa-example.com
.
When an image is pushed to another registry using a mirroring policy, scanning and signing data is not persisted in the destination repository.
If you have scanning enabled for the destination repository, DTR is going to scan the image pushed. If you want the image to be signed, you need to do it manually.
Docker Trusted Registry allows you to set up a mirror of a repository by constantly polling it and pulling new image tags as they are pushed. This ensures your images are replicated across different registries for high availability. It also makes it easy to create a development pipeline that allows different users access to a certain image without giving them access to everything in the remote registry.
To mirror a repository, start by creating a repository in the DTR deployment that will serve as your mirror. Previously, you were only able to set up pull mirroring from the API. Starting in DTR 2.6, you can also mirror and pull from a remote DTR or Docker Hub repository.
To get started:
Navigate to https://<dtr-url> and log in with your UCP credentials.
Select Repositories on the left navigation pane, and then click on the name of the repository that you want to view. Note that you will have to click on the repository name following the / after the specific namespace for your repository.
Select the Mirrors tab and click New mirror policy.
On the New mirror page, specify the following details:
Mirror direction: Choose “Pull from remote registry”
Registry type: You can choose between Docker Trusted Registry and Docker Hub. If you choose DTR, enter your DTR URL. Otherwise, Docker Hub defaults to https://index.docker.io.
Username and Password or access token: Your credentials in the remote repository you wish to poll from. To use an access token instead of your password, see [authentication token](../access-tokens.md).
Repository: Enter the namespace and the repository_name after the /.
Show advanced settings: Enter the TLS details for the remote repository or check Skip TLS verification. If the DTR remote repository is using self-signed certificates or certificates signed by your own certificate authority, you also need to provide the public key certificate for that CA. You can retrieve the certificate by accessing https://<dtr-domain>/ca. “Remote certificate authority” is optional for a remote repository in Docker Hub.
Click Connect.
Once you have successfully connected to the remote repository, click Save to mirror future tags. To mirror all tags, click Save & Apply instead.
There are different ways to send your DTR API requests. To explore the different API resources and endpoints from the web interface, click API on the bottom left navigation pane.
Search for the endpoint:
POST /api/v0/repositories/{namespace}/{reponame}/pollMirroringPolicies
Click Try it out and enter your HTTP request details.
namespace and reponame refer to the repository that will be
poll-mirrored. The boolean field initialEvaluation corresponds to
Save when set to false
and will only mirror images created
after your API request. Setting it to true
corresponds to
Save & Apply which means all tags in the remote repository will
be evaluated and mirrored. The other body parameters correspond to the
relevant remote repository details that you can see on the DTR web
interface. As a best practice,
use a service account just for this purpose. Instead of providing the
password for that account, you should pass an authentication
token.
If the DTR remote repository is using self-signed certificates or
certificates signed by your own certificate authority, you also need to
provide the public key certificate for that CA. You can get it by
accessing https://<dtr-domain>/ca
. The remoteCA
field is
optional for mirroring a Docker Hub repository.
Click Execute. On success, the API returns an HTTP 201
response.
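Alternatively, you can send the same request directly with curl. In the sketch below, only initialEvaluation and remoteCA are field names confirmed above; the remaining field names (remoteHost, remoteRepository, username, password) are assumptions based on the web form, so verify them in the API explorer for your DTR version before relying on them:
$ curl -u <username>:<token> -X POST \
    -H "Content-Type: application/json" \
    -d '{
          "initialEvaluation": false,
          "remoteHost": "https://dtr-example.com",
          "remoteRepository": "dev/website",
          "username": "<service-account>",
          "password": "<access-token>",
          "remoteCA": "-----BEGIN CERTIFICATE-----\n<contents of cert>\n-----END CERTIFICATE-----"
        }' \
    https://<dtr-url>/api/v0/repositories/<namespace>/<reponame>/pollMirroringPolicies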
Once configured, the system polls for changes in the remote repository
and runs the poll_mirror
job every 30 minutes. On success, the
system will pull in new images and mirror them in your local repository.
Starting in DTR 2.6, you can filter for poll_mirror
jobs to review
when the job was last run. To manually trigger the job and force pull
mirroring, use the POST /api/v0/jobs
API endpoint and specify
poll_mirror
as your action.
curl -X POST "https://<dtr-url>/api/v0/jobs" -H "accept: application/json" -H "content-type: application/json" -d "{ \"action\": \"poll_mirror\"}"
See Manage Jobs to learn more about job management within DTR.
When defining promotion policies you can use templates to dynamically name the tag that is going to be created.
You can use these template keywords to define your new tag:
Template | Description | Example result |
---|---|---|
%n | The tag to promote | 1, 4.5, latest |
%A | Day of the week | Sunday, Monday |
%a | Day of the week, abbreviated | Sun, Mon, Tue |
%w | Day of the week, as a number | 0, 1, 6 |
%d | Number for the day of the month | 01, 15, 31 |
%B | Month | January, December |
%b | Month, abbreviated | Jan, Jun, Dec |
%m | Month, as a number | 01, 06, 12 |
%Y | Year | 1999, 2015, 2048 |
%y | Year, two digits | 99, 15, 48 |
%H | Hour, in 24 hour format | 00, 12, 23 |
%I | Hour, in 12 hour format | 01, 10, 10 |
%p | Period of the day | AM, PM |
%M | Minute | 00, 10, 59 |
%S | Second | 00, 10, 59 |
%f | Microsecond | 000000, 999999 |
%Z | Name for the timezone | UTC, PST, EST |
%j | Day of the year | 001, 200, 366 |
%W | Week of the year | 00, 10, 53 |
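For example, assuming the tag 1 is promoted on 12 June 2019, a template that combines several of these keywords, such as %n-%Y-%m-%d, would produce a destination tag like 1-2019-06-12.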
Starting in DTR 2.6, each repository page includes an Activity tab which displays a sortable and paginated list of the most recent events within the repository. This offers better visibility along with the ability to audit events. Event types listed will vary according to your repository permission level. Additionally, DTR admins can enable auto-deletion of repository events as part of maintenance and cleanup.
In the following section, we will show you how to view and audit the list of events in a repository. We will also cover the event types associated with your permission level.
As of DTR 2.3, admins were able to view a list of DTR events using the API. DTR 2.6 enhances that feature by showing a permission-based events list for each repository page on the web interface. To view the list of events within a repository, do the following:
Navigate to https://<dtr-url>
and log in with your DTR credentials.
Select Repositories from the left navigation pane, and then
click on the name of the repository that you want to view. Note that you
will have to click on the repository name following the /
after
the specific namespace for your repository.
Select the Activity tab. You should see a paginated list of the
latest events based on your repository permission level. By default,
Activity shows the latest 10
events and excludes pull
events, which are only visible to repository and DTR admins.
The following table breaks down the data included in an event and uses
the highlighted Create Promotion Policy
event as an example.
Event detail | Description | Example |
---|---|---|
Label | Friendly name of the event. | Create Promotion Policy |
Repository | This will always be the repository in review following the
<user-or-org>/<repository_name> convention outlined in
Create a repository |
test-org/test-repo-1 |
Tag | Tag affected by the event, when applicable. | test-org/test-repo-1:latest where latest is the affected tag |
SHA | The digest value for CREATE operations such as creating a new image tag or a promotion policy. | sha256:bbf09ba3 |
Type | Event type. Possible values are: CREATE , GET , UPDATE ,
DELETE , SEND , FAIL and SCAN . |
CREATE |
Initiated by | The actor responsible for the event. For user-initiated events, this
will reflect the user ID and link to that user’s profile. For image
events triggered by a policy – pruning, pull / push mirroring, or
promotion – this will reflect the relevant policy ID except for manual
promotions where it reflects PROMOTION MANUAL_P , and link to the
relevant policy page. Other event actors may not include a link. |
PROMOTION CA5E7822 |
Date and Time | When the event happened in your configured time zone. | 2018 9:59 PM |
Given the level of detail on each event, it should be easy for DTR and security admins to determine what events have taken place inside of DTR. For example, when an image which shouldn’t have been deleted ends up getting deleted, the security admin can determine when and who initiated the deletion.
For more details on different permission levels within DTR, see Authentication and authorization in DTR to understand the minimum level required to view the different repository events.
Repository event | Description | Minimum permission level |
---|---|---|
Push | Refers to Create Manifest and Update Tag events. Learn more
about pushing images. |
Authenticated users |
Scan | Requires security scanning to be set
up by a DTR admin.
Once enabled, this will display as a SCAN event type. |
Authenticated users |
Promotion | Refers to a Create Promotion Policy event which links to the
Promotions tab of the repository where you can edit
the existing promotions. See Promotion Policies for different ways to promote
an image. |
Repository admin |
Delete | Refers to “Delete Tag” events. Learn more about Delete images. | Authenticated users |
Pull | Refers to “Get Tag” events. Learn more about Pull an image. | Repository admin |
Mirror | Refers to Pull mirroring and Push mirroring events.
See Mirror images to another registry and
Mirror images from another registry for more details. |
Repository admin |
Create repo | Refers to Create Repository events. See
Create a repository for more details. |
Authenticated users |
Docker Trusted Registry has a global setting for repository event auto-deletion. This allows event records to be removed as part of garbage collection. DTR administrators can enable auto-deletion of repository events in DTR 2.6 based on specified conditions which are covered below.
In your browser, navigate to https://<dtr-url> and log in with your admin credentials.
Select System from the left navigation pane which displays the Settings page by default.
Scroll down to Repository Events and turn on Auto-Deletion.
Specify the conditions with which an event auto-deletion will be triggered.
DTR allows you to set your auto-deletion conditions based on the following optional repository event attributes:
Name | Description | Example |
---|---|---|
Age | Lets you remove events older than your specified number of hours, days, weeks or months. | 2 months |
Max number of events | Lets you specify the maximum number of events allowed in the repositories. | 6000 |
If you check and specify both, events in your repositories will be removed during garbage collection if either condition is met. You should see a confirmation message right away.
Click Start GC if you’re ready. Read more about garbage collection if you’re unsure about this operation.
Navigate to System > Job Logs to confirm that onlinegc
has happened.
With the introduction of the experimental app plugin to the Docker CLI, DTR has been enhanced to include application management. In DTR 2.7, you can push an app to your DTR repository and have an application be clearly distinguished from individual and multi-architecture container images, as well as plugins. When you push an application to DTR, you see two image tags:
Image | Tag | Type | Under the hood |
---|---|---|---|
Invocation | <app_tag>-invoc |
Container image represented by OS and architecture (e.g.
linux amd64 ) |
Uses Docker Engine. The Docker daemon is responsible for building and pushing the image. |
Application with bundled components | <app_tag> |
Application | Uses the app client to build and push the image. docker app is
experimental on the Docker client. |
Notice the app-specific tags, app
and app-invoc
, with scan
results for the bundled components in the former and the invocation
image in the latter. To view the scanning results for the bundled
components, click “View Details” next to the app
tag.
Click on the image name or digest to see the vulnerabilities for that specific image.
The following repository and image management events also apply to applications:
fixing up "35.165.223.150/admin/lab-words:0.1.0" for push: failed to resolve "35.165.223.150/admin/lab-words:0.1.0-invoc", push the image to the registry before pushing the bundle: failed to do request: Head https://35.165.223.150/v2/admin/lab-words/manifests/0.1.0-invoc: x509: certificate signed by unknown authority
Check that your DTR has been configured with your TLS certificate’s
Fully Qualified Domain Name (FQDN). See Configure
DTR for more details.
For docker app
testing purposes, you can pass the
--insecure-registries
option when pushing an application.
docker app push hello-world --tag 35.165.223.150/admin/lab-words:0.1.0 --insecure-registries 35.165.223.150
35.165.223.150/admin/lab-words:0.1.0-invoc
Successfully pushed bundle to 35.165.223.150/admin/lab-words:0.1.0. Digest is sha256:bd1a813b6301939fa46e617f96711e0cca1e4065d2d724eb86abde6ef7b18e23.
See DTR 2.7 Release Notes - Known Issues for known issues related to applications in DTR.
Docker Trusted Registry allows you to create and distribute access tokens to enable programmatic access to DTR. Access tokens are linked to a particular user account and duplicate whatever permissions that account has at the time of use. If the account changes permissions, so will the token.
Access tokens are useful in cases such as building integrations since you can issue multiple tokens – one for each integration – and revoke them at any time.
To create an access token for the first time, log in to
https://<dtr-url> with your UCP credentials.
Expand your Profile from the left navigation pane and select Profile > Access Tokens.
Add a description for your token. Specify something which indicates where the token is going to be used, or set a purpose for the token. Administrators can also create tokens for other users.
Once the token is created, you will not be able to see it again. You do have the option to rename, deactivate or delete the token as needed. You can delete the token by selecting it and clicking Delete, or you can click View Details:
You can use an access token anywhere that requires your DTR password. As
an example you can pass your access token to the --password
or
-p
option when logging in from your Docker CLI client:
docker login dtr.example.org --username <username> --password <token>
To use the DTR API to list the repositories your user has access to:
curl --silent --insecure --user <username>:<token> dtr.example.org/api/v0/repositories
Tag pruning is the process of cleaning up unnecessary or unwanted repository tags. As of v2.6, you can configure the Docker Trusted Registry (DTR) to automatically perform tag pruning on repositories that you manage by:
- specifying a tag pruning policy, or
- setting a tag limit
Tag Pruning
When run, tag pruning only deletes a tag and does not carry out any actual blob deletion. For actual blob deletions, see Garbage Collection.
Known Issue
While the tag limit field is disabled when you turn on immutability for a new repository, this is currently not the case with Repository Settings. As a workaround, turn off immutability when setting a tag limit via Repository Settings > Pruning.
In the following section, we will cover how to specify a tag pruning policy and set a tag limit on repositories that you manage. It will not include modifying or deleting a tag pruning policy.
As a repository administrator, you can now add tag pruning policies on
each repository that you manage. To get started, navigate to
https://<dtr-url>
and log in with your credentials.
Select Repositories on the left navigation pane, and then click on
the name of the repository that you want to update. Note that you will
have to click on the repository name following the /
after the
specific namespace for your repository.
Select the Pruning tab, and click New pruning policy to specify your tag pruning criteria:
DTR allows you to set your pruning triggers based on the following image attributes:
Name | Description | Example |
---|---|---|
Tag name | Whether the tag name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Tag name = test |
Component name | Whether the image has a given component and the component name equals, starts with, ends with, contains, is one of, or is not one of your specified string values | Component name starts with b |
Vulnerabilities | Whether the image has vulnerabilities – critical, major, minor, or all – and your selected vulnerability filter is greater than or equals, greater than, equals, not equals, less than or equals, or less than your specified number | Critical vulnerabilities = 3 |
License | Whether the image uses an intellectual property license and is one of or not one of your specified words | License name = docker |
Last updated at | Whether the last image update was before your specified number of hours, days, weeks, or months. For details on valid time units, see Go’s ParseDuration function | Last updated at: Hours = 12 |
Specify one or more image attributes to add to your pruning criteria, then choose:
Upon selection, you will see a confirmation message and will be redirected to your newly updated Pruning tab.
If you have specified multiple pruning policies on the repository, the Pruning tab will display a list of your prune triggers and details on when the last tag pruning was performed based on the trigger, a toggle for deactivating or reactivating the trigger, and a View link for modifying or deleting your selected trigger.
All tag pruning policies on your account are evaluated every 15 minutes. Any qualifying tags are then deleted from the metadata store. If a tag pruning policy is modified or created, then the tag pruning policy for the affected repository will be evaluated.
In addition to pruning policies, you can also set tag limits on repositories that you manage to restrict the number of tags on a given repository. Repository tag limits are processed in a first in first out (FIFO) manner. For example, if you set a tag limit of 2, adding a third tag would push out the first.
To set a tag limit, select the repository that you want to update, open Settings > Pruning, enter the maximum number of tags, and save your changes.
The CLI tool has commands to install, configure, and backup Docker Trusted Registry (DTR). It also allows uninstalling DTR. By default the tool runs in interactive mode. It prompts you for the values needed.
Additional help is available for each command with the --help option.
docker run -it --rm docker/dtr \
command [command options]
If not specified, docker/dtr
uses the latest
tag by default. To
work with a different version, specify it in the command. For example,
docker run -it --rm docker/dtr:2.6.0
.
Create a backup of DTR
docker run -i --rm docker/dtr \
backup [command options] > backup.tar
docker run -i --rm --log-driver none docker/dtr:2.7.5 \
backup --ucp-ca "$(cat ca.pem)" --existing-replica-id 5eb9459a7832 > backup.tar
The following command has been tested on Linux:
DTR_VERSION=$(docker container inspect $(docker container ps -f \
name=dtr-registry -q) | grep -m1 -Po '(?<=DTR_VERSION=)\d.\d.\d'); \
REPLICA_ID=$(docker inspect -f '{{.Name}}' $(docker ps -q -f name=dtr-rethink) | cut -f 3 -d '-'); \
read -p 'ucp-url (The UCP URL including domain and port): ' UCP_URL; \
read -p 'ucp-username (The UCP administrator username): ' UCP_ADMIN; \
read -sp 'ucp password: ' UCP_PASSWORD; \
docker run --log-driver none -i --rm \
--env UCP_PASSWORD=$UCP_PASSWORD \
docker/dtr:$DTR_VERSION backup \
--ucp-username $UCP_ADMIN \
--ucp-url $UCP_URL \
--ucp-ca "$(curl https://${UCP_URL}/ca)" \
--existing-replica-id $REPLICA_ID > \
dtr-metadata-${DTR_VERSION}-backup-$(date +%Y%m%d-%H_%M_%S).tar
This command creates a tar
file with the contents of the volumes
used by DTR, and prints it. You can then use docker/dtr restore
to
restore the data from an existing backup.
Note
This command only creates backups of configurations and image metadata. It does not back up users and organizations. Users and organizations can be backed up during a UCP backup.
It also does not back up Docker images stored in your registry. You should implement a separate backup policy for the Docker images stored in your registry, taking into consideration whether your DTR installation is configured to store images on the filesystem or is using a cloud provider.
This backup contains sensitive information and should be stored securely.
Using the --offline-backup
flag temporarily shuts down the
RethinkDB container. Take the replica out of your load balancer to
avoid downtime.
Option | Environment variable | Description |
---|---|---|
--debug |
$DEBUG | Enable debug mode for additional logs. |
--existing-replica-id |
$DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify a DTR replica, you must connect to an existing healthy replica’s database. |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--offline-backup |
$DTR_OFFLINE_BACKUP | This flag takes RethinkDB down during backup and takes a more reliable backup. If you back up DTR with this flag, RethinkDB will go down during backup. However, offline backups are guaranteed to be more consistent than online backups. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP
TLS CA certificate from https://<ucp-url>/ca , and use --ucp-ca
"$(cat ca.pem)" . |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation
uses TLS but always trusts the TLS certificate used by UCP, which can
lead to MITM (man-in-the-middle) attacks. For production deployments,
use --ucp-ca "$(cat ca.pem)" instead. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
Destroy a DTR replica’s data
docker run -it --rm docker/dtr \
destroy [command options]
This command forcefully removes all containers and volumes associated with a DTR replica without notifying the rest of the cluster. Use this command on all replicas to uninstall DTR.
Use the ‘remove’ command to gracefully scale down your DTR cluster.
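Because the tool runs interactively by default, a minimal invocation can be as simple as the sketch below, which prompts for the UCP details and the replica ID; replace the version tag with your own, and use --ucp-ca "$(cat ca.pem)" instead of --ucp-insecure-tls in production:
$ docker run -it --rm docker/dtr:2.7.5 destroy --ucp-insecure-tls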
Option | Environment variable | Description |
---|---|---|
--replica-id |
$DTR_DESTROY_REPLICA_ID | The ID of the replica to destroy. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--debug |
$DEBUG | Enable debug mode for additional logs. |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always
trusts the TLS certificate used by UCP, which can lead to man-in-the-middle attacks. For production deployments, use
--ucp-ca "$(cat ca.pem)" instead. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA
certificate from https://<ucp-url>/ca , and use --ucp-ca "$(cat ca.pem)" . |
Recover DTR from loss of quorum
docker run -it --rm docker/dtr \
emergency-repair [command options]
This command repairs a DTR cluster that has lost quorum by reverting your cluster to a single DTR replica.
There are three steps you can take to recover an unhealthy DTR cluster:
- If the majority of replicas are healthy, remove the unhealthy replicas from the cluster and join new ones for high availability.
- If the majority of replicas are unhealthy, use this command to revert your cluster to a single DTR replica.
- If you cannot repair your cluster to a single replica, you have to restore from an existing backup, using the restore command.
When you run this command, a DTR replica of your choice is repaired and
turned into the only replica in the whole DTR cluster. The containers
for all the other DTR replicas are stopped and removed. When using the
force
option, the volumes for these replicas are also deleted.
After repairing the cluster, you should use the join
command to add
more DTR replicas for high availability.
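A minimal sketch of the command, assuming 5eb9459a7832 (the replica ID used in the backup example above) is the healthy replica you want to keep, and again substituting --ucp-ca "$(cat ca.pem)" for --ucp-insecure-tls in production:
$ docker run -it --rm docker/dtr:2.7.5 emergency-repair \
    --ucp-insecure-tls \
    --existing-replica-id 5eb9459a7832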
Option | Environment variable | Description |
---|---|---|
--debug |
$DEBUG | Enable debug mode for additional logs. |
--existing-replica-id |
$DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify DTR, you must connect to an existing healthy replica’s database. |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--overlay-subnet |
$DTR_OVERLAY_SUBNET | The subnet used by the dtr-ol overlay network.
Example: 10.0.0.0/24 . For high-availability, DTR creates an overlay
network between UCP nodes. This flag allows you to choose the subnet for
that network. Make sure the subnet you choose is not used on any machine
where DTR replicas are deployed. |
--prune |
$PRUNE | Delete the data volumes of all unhealthy replicas. With this option, the volume of the DTR replica you’re restoring is preserved but the volumes for all other replicas are deleted. This has the same result as completely uninstalling DTR from those replicas. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP
TLS CA certificate from https://<ucp-url>/ca , and use --ucp-ca "$(cat
ca.pem)" . |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation
uses TLS but always trusts the TLS certificate used by UCP, which can
lead to MITM (man-in-the-middle) attacks. For production deployments,
use --ucp-ca "$(cat ca.pem)" instead. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
--y, yes |
$YES | Answer yes to any prompts. |
List all the images necessary to install DTR
docker run -it --rm docker/dtr \
images [command options]
This command lists all the images necessary to install DTR.
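For example, you can combine the output with docker pull to pre-fetch the images on a node before installing; this sketch assumes the node can reach Docker Hub and uses the 2.7.5 tag shown elsewhere in this guide:
$ docker run --rm docker/dtr:2.7.5 images | xargs -L 1 docker pull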
Install Docker Trusted Registry
docker run -it --rm docker/dtr \
install [command options]
This command installs Docker Trusted Registry (DTR) on a node managed by Docker Universal Control Plane (UCP).
After installing DTR, you can join additional DTR replicas using
docker/dtr join
.
$ docker run -it --rm docker/dtr:2.7.5 install \
--ucp-node <UCP_NODE_HOSTNAME> \
--ucp-insecure-tls
Note
Use --ucp-ca "$(cat ca.pem)"
instead of --ucp-insecure-tls
for a production deployment.
Option | Environment variable | Description |
---|---|---|
--async-nfs |
$ASYNC_NFS | Use async NFS volume options on the replica specified in the
--existing-replica-id option. The NFS configuration must be set with
--nfs-storage-url explicitly to use this option. Using
--async-nfs will bring down any containers on the replica that use
the NFS volume, delete the NFS volume, bring it back up with the
appropriate configuration, and restart any containers that were brought
down. |
--client-cert-auth-ca |
$CLIENT_CA | Specify root CA certificates for client authentication with
--client-cert-auth-ca "$(cat ca.pem)" . |
--debug |
$DEBUG | Enable debug mode for additional logs. |
--dtr-ca |
$DTR_CA | Use a PEM-encoded TLS CA certificate for DTR. By default DTR generates a
self-signed TLS certificate during deployment. You can use your own root
CA public certificate with --dtr-ca "$(cat ca.pem)" . |
--dtr-cert |
$DTR_CERT | Use a PEM-encoded TLS certificate for DTR. By default DTR generates a
self-signed TLS certificate during deployment. You can use your own
public key certificate with --dtr-cert "$(cat cert.pem)" . If the
certificate has been signed by an intermediate certificate authority,
append its public key certificate at the end of the file to establish a
chain of trust. |
--dtr-external-url |
$DTR_EXTERNAL_URL | URL of the host or load balancer clients use to reach DTR. When you use
this flag, users are redirected to UCP for logging in. Once
authenticated they are redirected to the URL you specify in this flag.
If you don’t use this flag, DTR is deployed without single sign-on with
UCP. Users and teams are shared but users log in separately into the two
applications. You can enable and disable single sign-on within your DTR
system settings. Format https://host[:port] , where port is the
value you used with --replica-https-port . Since HSTS (HTTP
Strict-Transport-Security) header is included in all API responses, make
sure to specify the FQDN (Fully Qualified Domain Name) of your DTR, or
your browser may refuse to load the web interface. |
--dtr-key |
$DTR_KEY | Use a PEM-encoded TLS private key for DTR. By default DTR generates a
self-signed TLS certificate during deployment. You can use your own TLS
private key with --dtr-key "$(cat key.pem)" . |
--dtr-storage-volume |
$DTR_STORAGE_VOLUME | Customize the volume to store Docker images. By default DTR creates a
volume to store the Docker images in the local filesystem of the node
where DTR is running, without high-availability. Use this flag to
specify a full path or volume name for DTR to store images. For
high-availability, make sure all DTR replicas can read and write data on
this volume. If you’re using NFS, use --nfs-storage-url instead. |
--enable-client-cert-auth |
$ENABLE_CLIENT_CERT_AUTH | Enables TLS client certificate authentication; use
--enable-client-cert-auth=false to disable it. If enabled, DTR will
additionally authenticate users via TLS client certificates. You must
also specify the root certificate authorities (CAs) that issued the
certificates with --client-cert-auth-ca . |
--enable-pprof |
$DTR_PPROF | Enables pprof profiling of the server. Use --enable-pprof=false to
disable it. Once DTR is deployed with this flag, you can access the pprof
endpoint for the api server at /debug/pprof , and the registry
endpoint at /registry_debug_pprof/debug/pprof . |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--http-proxy |
$DTR_HTTP_PROXY | The HTTP proxy used for outgoing requests. |
--https-proxy |
$DTR_HTTPS_PROXY | The HTTPS proxy used for outgoing requests. |
--log-host |
$LOG_HOST | The syslog endpoint to send logs to. Use
this flag if you set --log-protocol to tcp or udp . |
--log-level |
$LOG_LEVEL | Log level for all container logs when logging to syslog. Default: INFO. The supported log levels are debug, info, warn, error, or fatal. |
--log-protocol |
$LOG_PROTOCOL | The protocol for sending logs. Default is internal. By default, DTR
internal components log information using the logger specified in the
Docker daemon in the node where the DTR replica is deployed. Use this
option to send DTR logs to an external syslog system. The supported
values are tcp , udp , or internal . Internal is the default
option, stopping DTR from sending logs to an external system. Use this
flag with --log-host . |
--nfs-storage-url |
$NFS_STORAGE_URL | Use NFS to store Docker images following this format: nfs://<ip|
hostname>/<mountpoint> . By default, DTR creates a volume to store the
Docker images in the local filesystem of the node where DTR is running,
without high availability. To use this flag, you need to install an NFS
client library like nfs-common in the node where you’re deploying DTR.
You can test this by running showmount -e <nfs-server> . When you
join new replicas, they will start using NFS so there is no need to
specify this flag. To reconfigure DTR to stop using NFS, leave this
option empty: --nfs-storage-url "" . See USE NFS for more details. |
--nfs-options |
$NFS_OPTIONS | Pass in NFS volume options verbatim for the replica specified in the
--existing-replica-id option. The NFS configuration must be set with
--nfs-storage-url explicitly to use this option. Specifying
--nfs-options will pass in character-for-character the options
specified in the argument when creating or recreating the NFS volume.
For instance, to use NFS v4 with async, pass in “rw,nfsvers=4,async” as
the argument. |
--no-proxy |
$DTR_NO_PROXY | List of domains the proxy should not be used for. When using
--http-proxy you can use this flag to specify a list of domains that
you don’t want to route through the proxy. Format acme.com[, acme.org] . |
--overlay-subnet |
$DTR_OVERLAY_SUBNET | The subnet used by the dtr-ol overlay network. Example: 10.0.0.0/24 .
For high-availability, DTR creates an overlay network between UCP nodes.
This flag allows you to choose the subnet for that network. Make sure
the subnet you choose is not used on any machine where DTR replicas are
deployed. |
--replica-http-port |
$REPLICA_HTTP_PORT | The public HTTP port for the DTR replica. Default is 80 . This allows
you to customize the HTTP port where users can reach DTR. Once users
access the HTTP port, they are redirected to use an HTTPS connection,
using the port specified with --replica-https-port . This port can
also be used for unencrypted health checks. |
--replica-https-port |
$REPLICA_HTTPS_PORT | The public HTTPS port for the DTR replica. Default is 443 . This
allows you to customize the HTTPS port where users can reach DTR. Each
replica can use a different port. |
--replica-id |
$DTR_INSTALL_REPLICA_ID | Assign a 12-character hexadecimal ID to the DTR replica. Random by default. |
--replica-rethinkdb-cache-mb |
$RETHINKDB_CACHE_MB | The maximum amount of space in MB for RethinkDB in-memory cache used by
the given replica. Default is auto. Auto is (available_memory - 1024)
/ 2 . This config allows changing the RethinkDB cache usage per replica.
You need to run it once per replica to change each one. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA
certificate from https://<ucp-url>/ca , and use --ucp-ca "$(cat ca.pem)" . |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always
trusts the TLS certificate used by UCP, which can lead to MITM
(man-in-the-middle) attacks. For production deployments, use --ucp-ca
"$(cat ca.pem)" instead. |
--ucp-node |
$UCP_NODE | The hostname of the UCP node to deploy DTR. Random by default. You can
find the hostnames of the nodes in the cluster in the UCP web interface,
or by running docker node ls on a UCP manager node. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
Add a new replica to an existing DTR cluster. Use SSH to log into any node that is already part of UCP.
docker run -it --rm \
docker/dtr:2.7.5 join \
--ucp-node <ucp-node-name> \
--ucp-insecure-tls
This command creates a replica of an existing DTR on a node managed by Docker Universal Control Plane (UCP).
To set up DTR for high availability, create 3, 5, or 7 DTR replicas.
Option | Environment variable | Description |
---|---|---|
--debug |
$DEBUG | Enable debug mode for additional logs. |
--existing-replica-id |
$DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify DTR, you must connect to an existing healthy replica’s database. |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--replica-http-port |
$REPLICA_HTTP_PORT | The public HTTP port for the DTR replica. Default is 80 . This allows
you to customize the HTTP port where users can reach DTR. Once users
access the HTTP port, they are redirected to use an HTTPS connection,
using the port specified with --replica-https-port . This port can
also be used for unencrypted health checks. |
--replica-https-port |
$REPLICA_HTTPS_PORT | The public HTTPS port for the DTR replica. Default is 443 . This
allows you to customize the HTTPS port where users can reach DTR. Each
replica can use a different port. |
--replica-id |
$DTR_INSTALL_REPLICA_ID | Assign a 12-character hexadecimal ID to the DTR replica. Random by default. |
--replica-rethinkdb-cache-mb |
$RETHINKDB_CACHE_MB | The maximum amount of space in MB for RethinkDB in-memory cache used by
the given replica. Default is auto. Auto is (available_memory - 1024)
/ 2 . This config allows changing the RethinkDB cache usage per
replica. You need to run it once per replica to change each one. |
--skip-network-test |
$DTR_SKIP_NETWORK_TEST | Don’t test if overlay networks are working correctly between UCP nodes. For high-availability, DTR creates an overlay network between UCP nodes and tests that it is working when joining replicas. Don’t use this option for production deployments. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA
certificate from https://<ucp-url>/ca , and use --ucp-ca "$(cat
ca.pem)" . |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always
trusts the TLS certificate used by UCP, which can lead to MITM
(man-in-the-middle) attacks. For production deployments, use --ucp-ca
"$(cat ca.pem)" instead. |
--ucp-node |
$UCP_NODE | The hostname of the UCP node to deploy DTR. Random by default. You can find the hostnames of the nodes in the cluster in the UCP web interface, or by running docker node ls on a UCP manager node. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
--unsafe-join |
$DTR_UNSAFE_JOIN | Join a new replica even if the cluster is unhealthy. Joining replicas to an unhealthy DTR cluster leads to split-brain scenarios and data loss. Don’t use this option for production deployments. |
Change DTR configurations.
docker run -it --rm docker/dtr \
reconfigure [command options]
This command changes DTR configuration settings. If you are using NFS as a storage volume, see Configuring DTR for NFS for details on changes to the reconfiguration process.
DTR is restarted for the new configurations to take effect. To have no down time, configure your DTR for high availability.
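For example, a minimal sketch that updates the external URL of an existing deployment (in production, authenticate with --ucp-ca "$(cat ca.pem)" rather than --ucp-insecure-tls):
$ docker run -it --rm docker/dtr:2.7.5 reconfigure \
    --ucp-insecure-tls \
    --dtr-external-url https://dtr.example.com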
Option | Environment variable | Description |
---|---|---|
--async-nfs | $ASYNC_NFS | Use async NFS volume options on the replica specified in the --existing-replica-id option. The NFS configuration must be set with --nfs-storage-url explicitly to use this option. Using --async-nfs will bring down any containers on the replica that use the NFS volume, delete the NFS volume, bring it back up with the appropriate configuration, and restart any containers that were brought down. |
--client-cert-auth-ca | $CLIENT_CA | Specify root CA certificates for client authentication with --client-cert-auth-ca "$(cat ca.pem)". |
--debug | $DEBUG | Enable debug mode for additional logs of this bootstrap container (the log level of downstream DTR containers can be set with --log-level). |
--dtr-ca | $DTR_CA | Use a PEM-encoded TLS CA certificate for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own root CA public certificate with --dtr-ca "$(cat ca.pem)". |
--dtr-cert | $DTR_CERT | Use a PEM-encoded TLS certificate for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own public key certificate with --dtr-cert "$(cat cert.pem)". If the certificate has been signed by an intermediate certificate authority, append its public key certificate at the end of the file to establish a chain of trust. |
--dtr-external-url | $DTR_EXTERNAL_URL | URL of the host or load balancer clients use to reach DTR. When you use this flag, users are redirected to UCP for logging in. Once authenticated they are redirected to the URL you specify in this flag. If you don’t use this flag, DTR is deployed without single sign-on with UCP. Users and teams are shared but users log in separately into the two applications. You can enable and disable single sign-on in the DTR settings. Format https://host[:port], where port is the value you used with --replica-https-port. Since the HSTS (HTTP Strict-Transport-Security) header is included in all API responses, make sure to specify the FQDN (Fully Qualified Domain Name) of your DTR, or your browser may refuse to load the web interface. |
--dtr-key | $DTR_KEY | Use a PEM-encoded TLS private key for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own TLS private key with --dtr-key "$(cat key.pem)". |
--dtr-storage-volume | $DTR_STORAGE_VOLUME | Customize the volume to store Docker images. By default DTR creates a volume to store the Docker images in the local filesystem of the node where DTR is running, without high availability. Use this flag to specify a full path or volume name for DTR to store images. For high availability, make sure all DTR replicas can read and write data on this volume. If you’re using NFS, use --nfs-storage-url instead. |
--enable-client-cert-auth | $ENABLE_CLIENT_CERT_AUTH | Enables TLS client certificate authentication; use --enable-client-cert-auth=false to disable it. If enabled, DTR will additionally authenticate users via TLS client certificates. You must also specify the root certificate authorities (CAs) that issued the certificates with --client-cert-auth-ca. |
--enable-pprof | $DTR_PPROF | Enables pprof profiling of the server. Use --enable-pprof=false to disable it. Once DTR is deployed with this flag, you can access the pprof endpoint for the API server at /debug/pprof, and the registry endpoint at /registry_debug_pprof/debug/pprof. |
--existing-replica-id | $DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify DTR, you must connect to an existing healthy replica’s database. |
--help-extended | $DTR_EXTENDED_HELP | Display extended help text for a given command. |
--http-proxy | $DTR_HTTP_PROXY | The HTTP proxy used for outgoing requests. |
--https-proxy | $DTR_HTTPS_PROXY | The HTTPS proxy used for outgoing requests. |
--log-host | $LOG_HOST | The endpoint of the syslog system to send logs to. Use this flag if you set --log-protocol to tcp or udp. |
--log-level | $LOG_LEVEL | Log level for all container logs when logging to syslog. Default: INFO. The supported log levels are debug, info, warn, error, or fatal. |
--log-protocol | $LOG_PROTOCOL | The protocol for sending logs. Default is internal. By default, DTR internal components log information using the logger specified in the Docker daemon in the node where the DTR replica is deployed. Use this option to send DTR logs to an external syslog system. The supported values are tcp, udp, and internal. Internal is the default option, stopping DTR from sending logs to an external system. Use this flag with --log-host. |
--nfs-storage-url | $NFS_STORAGE_URL | When running DTR 2.5 (with experimental online garbage collection) and 2.6.0-2.6.3, there is an issue with reconfiguring and restoring DTR with --nfs-storage-url which leads to erased tags. Make sure to back up your DTR metadata before you proceed. To work around the issue, manually create a storage volume on each DTR node and reconfigure DTR with --dtr-storage-volume and your newly-created volume instead. See Reconfigure Using a Local NFS Volume for more details. To reconfigure DTR to stop using NFS, leave this option empty: --nfs-storage-url "". See USE NFS for more details. Upgrade to 2.6.4 and follow Best practice for data migration in 2.6.4 when switching storage backends. |
--nfs-options | $NFS_OPTIONS | Pass in NFS volume options verbatim for the replica specified in the --existing-replica-id option. The NFS configuration must be set with --nfs-storage-url explicitly to use this option. Specifying --nfs-options will pass in character-for-character the options specified in the argument when creating or recreating the NFS volume. For instance, to use NFS v4 with async, pass in "rw,nfsvers=4,async" as the argument. |
--no-proxy | $DTR_NO_PROXY | List of domains the proxy should not be used for. When using --http-proxy you can use this flag to specify a list of domains that you don’t want to route through the proxy. Format acme.com[, acme.org]. |
--replica-http-port | $REPLICA_HTTP_PORT | The public HTTP port for the DTR replica. Default is 80. This allows you to customize the HTTP port where users can reach DTR. Once users access the HTTP port, they are redirected to use an HTTPS connection, using the port specified with --replica-https-port. This port can also be used for unencrypted health checks. |
--replica-https-port | $REPLICA_HTTPS_PORT | The public HTTPS port for the DTR replica. Default is 443. This allows you to customize the HTTPS port where users can reach DTR. Each replica can use a different port. |
--replica-rethinkdb-cache-mb | $RETHINKDB_CACHE_MB | The maximum amount of space in MB for the RethinkDB in-memory cache used by the given replica. Default is auto. Auto is (available_memory - 1024) / 2. This config allows changing the RethinkDB cache usage per replica. You need to run it once per replica to change each one. |
--storage-migrated | $STORAGE_MIGRATED | A flag added in 2.6.4 which lets you indicate the migration status of your storage data. Specify this flag if you are migrating to a new storage backend and have already moved all contents from your old backend to your new one. If not specified, DTR will assume the new backend is empty during a backend storage switch, and consequently destroy your existing tags and related image metadata. |
--ucp-ca | $UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA certificate from https://<ucp-url>/ca, and use --ucp-ca "$(cat ca.pem)". |
--ucp-password | $UCP_PASSWORD | The UCP administrator password. |
--ucp-url | $UCP_URL | The UCP URL including domain and port. |
--ucp-username | $UCP_USERNAME | The UCP administrator username. |
Remove a DTR replica from a cluster
docker run -it --rm docker/dtr \
remove [command options]
This command gracefully scales down your DTR cluster by removing exactly one replica. All other replicas must be healthy and will remain healthy after this operation.
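As an illustration, removing a single replica might be invoked as in the sketch below; the replica IDs and UCP values are placeholders:

docker run -it --rm docker/dtr \
  remove \
  --existing-replica-id <healthy-replica-id> \
  --replica-ids <replica-id-to-remove> \
  --ucp-url <ucp-url> \
  --ucp-username <ucp-admin-user> \
  --ucp-password <ucp-admin-password> \
  --ucp-ca "$(cat ca.pem)"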
Option | Environment variable | Description |
---|---|---|
--debug | $DEBUG | Enable debug mode for additional logs. |
--existing-replica-id | $DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify DTR, you must connect to an existing healthy replica’s database. |
--help-extended | $DTR_EXTENDED_HELP | Display extended help text for a given command. |
--replica-id | $DTR_REMOVE_REPLICA_ID | DEPRECATED Alias for --replica-ids. |
--replica-ids | $DTR_REMOVE_REPLICA_IDS | A comma-separated list of IDs of replicas to remove from the cluster. |
--ucp-ca | $UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA certificate from https://<ucp-url>/ca, and use --ucp-ca "$(cat ca.pem)". |
--ucp-insecure-tls | $UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always trusts the TLS certificate used by UCP, which can lead to MITM (man-in-the-middle) attacks. For production deployments, use --ucp-ca "$(cat ca.pem)" instead. |
--ucp-password | $UCP_PASSWORD | The UCP administrator password. |
--ucp-url | $UCP_URL | The UCP URL including domain and port. |
--ucp-username | $UCP_USERNAME | The UCP administrator username. |
Install and restore DTR from an existing backup
docker run -i --rm docker/dtr \
restore [command options] < backup.tar
This command performs a fresh installation of DTR and reconfigures it with configuration data from a tar file generated by docker/dtr backup. If you are restoring DTR after a failure, make sure you have fully destroyed the old DTR first.
There are three steps you can take to recover an unhealthy DTR cluster:
1. If the majority of replicas are healthy, remove the unhealthy replicas from the cluster and join new ones for high availability.
2. If the majority of replicas are unhealthy, use the emergency-repair command to revert the cluster to a known state.
3. If you can’t repair your cluster to a healthy state, restore from an existing backup using the restore command.
This command does not restore Docker images. You should implement a separate restore procedure for the Docker images stored in your registry, taking into consideration whether your DTR installation is configured to store images on the local filesystem or using a cloud provider.
After restoring the cluster, you should use the join command to add more DTR replicas for high availability.
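As an illustration, a restore from a backup file might be invoked as in the sketch below; the UCP values, node hostname, and external URL are placeholders:

docker run -i --rm docker/dtr \
  restore \
  --ucp-url <ucp-url> \
  --ucp-username <ucp-admin-user> \
  --ucp-password <ucp-admin-password> \
  --ucp-node <ucp-node-name> \
  --ucp-ca "$(cat ca.pem)" \
  --dtr-external-url https://<dtr-fqdn> < backup.tar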
Option | Environment variable | Description |
---|---|---|
--debug | $DEBUG | Enable debug mode for additional logs. |
--dtr-ca | $DTR_CA | Use a PEM-encoded TLS CA certificate for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own TLS CA certificate with --dtr-ca "$(cat ca.pem)". |
--dtr-cert | $DTR_CERT | Use a PEM-encoded TLS certificate for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own TLS certificate with --dtr-cert "$(cat cert.pem)". |
--dtr-external-url | $DTR_EXTERNAL_URL | URL of the host or load balancer clients use to reach DTR. When you use this flag, users are redirected to UCP for logging in. Once authenticated they are redirected to the URL you specify in this flag. If you don’t use this flag, DTR is deployed without single sign-on with UCP. Users and teams are shared but users log in separately into the two applications. You can enable and disable single sign-on within your DTR system settings. Format https://host[:port], where port is the value you used with --replica-https-port. |
--dtr-key | $DTR_KEY | Use a PEM-encoded TLS private key for DTR. By default DTR generates a self-signed TLS certificate during deployment. You can use your own TLS private key with --dtr-key "$(cat key.pem)". |
--dtr-storage-volume | $DTR_STORAGE_VOLUME | Mandatory flag to allow DTR to fall back to your configured storage setting at the time of backup. If you have previously configured DTR to use a full path or volume name for storage, specify this flag to use the same setting on restore. See docker/dtr install and docker/dtr reconfigure for usage details. |
--dtr-use-default-storage | $DTR_DEFAULT_STORAGE | Mandatory flag to allow DTR to fall back to your configured storage backend at the time of backup. If cloud storage was configured, then the default storage on restore is cloud storage. Otherwise, local storage is used. With DTR 2.5 (with experimental online garbage collection) and 2.6.0-2.6.3, this flag must be specified in order to keep your DTR metadata. If you encounter an issue with lost tags, see Restore to Cloud Storage for Docker’s recommended recovery strategy. Upgrade to 2.6.4 and follow Best practice for data migration in 2.6.4 when switching storage backends. |
--nfs-storage-url | $NFS_STORAGE_URL | Mandatory flag to allow DTR to fall back to your configured storage setting at the time of backup. When running DTR 2.5 (with experimental online garbage collection) and 2.6.0-2.6.3, there is an issue with reconfiguring and restoring DTR with --nfs-storage-url which leads to erased tags. Make sure to back up your DTR metadata before you proceed. If NFS was previously configured, you have to manually create a storage volume on each DTR node and specify --dtr-storage-volume with the newly-created volume instead. See Restore to a Local NFS Volume for more details. For additional NFS configuration options to support NFS v4, see docker/dtr install and docker/dtr reconfigure. Upgrade to 2.6.4 and follow Best practice for data migration in 2.6.4 when switching storage backends. |
--enable-pprof | $DTR_PPROF | Enables pprof profiling of the server. Use --enable-pprof=false to disable it. Once DTR is deployed with this flag, you can access the pprof endpoint for the API server at /debug/pprof, and the registry endpoint at /registry_debug_pprof/debug/pprof. |
--help-extended | $DTR_EXTENDED_HELP | Display extended help text for a given command. |
--http-proxy | $DTR_HTTP_PROXY | The HTTP proxy used for outgoing requests. |
--https-proxy | $DTR_HTTPS_PROXY | The HTTPS proxy used for outgoing requests. |
--log-host | $LOG_HOST | The endpoint of the syslog system to send logs to. Use this flag if you set --log-protocol to tcp or udp. |
--log-level | $LOG_LEVEL | Log level for all container logs when logging to syslog. Default: INFO. The supported log levels are debug, info, warn, error, or fatal. |
--log-protocol | $LOG_PROTOCOL | The protocol for sending logs. Default is internal. By default, DTR internal components log information using the logger specified in the Docker daemon in the node where the DTR replica is deployed. Use this option to send DTR logs to an external syslog system. The supported values are tcp, udp, and internal. Internal is the default option, stopping DTR from sending logs to an external system. Use this flag with --log-host. |
--no-proxy | $DTR_NO_PROXY | List of domains the proxy should not be used for. When using --http-proxy you can use this flag to specify a list of domains that you don’t want to route through the proxy. Format acme.com[, acme.org]. |
--replica-http-port | $REPLICA_HTTP_PORT | The public HTTP port for the DTR replica. Default is 80. This allows you to customize the HTTP port where users can reach DTR. Once users access the HTTP port, they are redirected to use an HTTPS connection, using the port specified with --replica-https-port. This port can also be used for unencrypted health checks. |
--replica-https-port | $REPLICA_HTTPS_PORT | The public HTTPS port for the DTR replica. Default is 443. This allows you to customize the HTTPS port where users can reach DTR. Each replica can use a different port. |
--replica-id | $DTR_INSTALL_REPLICA_ID | Assign a 12-character hexadecimal ID to the DTR replica. Random by default. |
--replica-rethinkdb-cache-mb | $RETHINKDB_CACHE_MB | The maximum amount of space in MB for the RethinkDB in-memory cache used by the given replica. Default is auto. Auto is (available_memory - 1024) / 2. This config allows changing the RethinkDB cache usage per replica. You need to run it once per replica to change each one. |
--ucp-ca | $UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA certificate from https://<ucp-url>/ca, and use --ucp-ca "$(cat ca.pem)". |
--ucp-insecure-tls | $UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always trusts the TLS certificate used by UCP, which can lead to MITM (man-in-the-middle) attacks. For production deployments, use --ucp-ca "$(cat ca.pem)" instead. |
--ucp-node | $UCP_NODE | The hostname of the UCP node to deploy DTR. Random by default. You can find the hostnames of the nodes in the cluster in the UCP web interface, or by running docker node ls on a UCP manager node. |
--ucp-password | $UCP_PASSWORD | The UCP administrator password. |
--ucp-url | $UCP_URL | The UCP URL including domain and port. |
--ucp-username | $UCP_USERNAME | The UCP administrator username. |
Upgrade DTR 2.5.x cluster to this version
docker run -it --rm docker/dtr \
upgrade [command options]
This command upgrades DTR 2.5.x to the current version of this image.
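As an illustration, an upgrade might be invoked as in the sketch below; the replica ID and UCP values are placeholders:

docker run -it --rm docker/dtr \
  upgrade \
  --existing-replica-id <replica-id> \
  --ucp-url <ucp-url> \
  --ucp-username <ucp-admin-user> \
  --ucp-password <ucp-admin-password> \
  --ucp-ca "$(cat ca.pem)"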
Option | Environment variable | Description |
---|---|---|
--debug |
$DEBUG | Enable debug mode for additional logs. |
--existing-replica-id |
$DTR_REPLICA_ID | The ID of an existing DTR replica. To add, remove or modify DTR, you must connect to an existing healthy replica’s database. |
--help-extended |
$DTR_EXTENDED_HELP | Display extended help text for a given command. |
--ucp-ca |
$UCP_CA | Use a PEM-encoded TLS CA certificate for UCP. Download the UCP TLS CA
certificate from https://<ucp-url>/ca , and use --ucp-ca "$(cat ca.pem)" . |
--ucp-insecure-tls |
$UCP_INSECURE_TLS | Disable TLS verification for UCP. The installation uses TLS but always
trusts the TLS certificate used by UCP, which can lead to MITM
(man-in-the-middle) attacks. For production deployments, use --ucp-ca
"$(cat ca.pem)" instead. |
--ucp-password |
$UCP_PASSWORD | The UCP administrator password. |
--ucp-url |
$UCP_URL | The UCP URL including domain and port. |
--ucp-username |
$UCP_USERNAME | The UCP administrator username. |
This section outlines the DTR features and components that are being deprecated.
Since v2.5, it has been possible for repository admins to autogenerate manifest lists when creating a repository via the API. You accomplish this by setting enableManifestLists to true when sending a POST request to the /api/v0/repositories/{namespace} endpoint. When enabled for a repository, any image that you push to an existing tag will be appended to the list of manifests for that tag. enableManifestLists is set to false by default, which means pushing a new image to an existing tag will overwrite the manifest entry for that tag.
The above behavior and the enableManifestLists field will be removed in v2.7. Starting in v2.7, you can use the DTR CLI to create and push a manifest list to any repository.
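As an illustrative sketch only, a request that enables this behavior when creating a repository might look like the following; the repository fields other than enableManifestLists, the credentials, and the hostname are assumptions for the example, not documented values:

curl -u <username>:<password-or-token> \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"name": "myrepo", "visibility": "private", "enableManifestLists": true}' \
  https://<dtr-url>/api/v0/repositories/<namespace>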
Your Docker Enterprise subscription gives you access to prioritized support. The service levels depend on your subscription.
Before reaching out to support, make sure you’re listed as an authorized support contact for your account. If you’re not, find a person who is and ask them to open a case with Docker Support on your behalf.
You can open a new support case at the Docker support page. If you’re unable to submit a new case using the support page, fill in the Docker account support form using your company email address.
Support engineers may ask you to provide a UCP support dump, which is an archive that contains UCP system logs and diagnostic information. If a node is unhealthy or not joined to the cluster, the support dump collected from the web UI does not contain logs from that node; for unhealthy nodes, use the CLI to get a support dump.
To get the support dump from the Web UI:
It may take a few minutes for the download to complete.
To get the support dump from the CLI, use SSH to log into a node and run:
UCP_VERSION=$((docker container inspect ucp-proxy --format '{{index .Config.Labels "com.docker.ucp.version"}}' 2>/dev/null || echo -n 3.2.6)|tr -d [[:space:]])
docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --log-driver none \
  docker/ucp:${UCP_VERSION} \
  support > \
  docker-support-${HOSTNAME}-$(date +%Y%m%d-%H_%M_%S).tgz
This support dump only contains logs for the node where you’re running the command. If your UCP is highly available, you should collect support dumps from all of the manager nodes.
On Windows worker nodes, run the following command to generate a local support dump:
docker container run --name windowssupport -v 'C:\ProgramData\docker\daemoncerts:C:\ProgramData\docker\daemoncerts' -v 'C:\Windows\system32\winevt\logs:C:\eventlogs:ro' docker/ucp-dsinfo-win:3.2.6
docker cp windowssupport:'C:\dsinfo' .
docker rm -f windowssupport
This command creates a directory named dsinfo in your current directory. If you want an archive file, you need to create it from the dsinfo directory.
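For example, assuming PowerShell is available on the node, you could package the directory into a zip archive; the output file name is arbitrary:

Compress-Archive -Path .\dsinfo -DestinationPath .\dsinfo.zip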