MKE Metrics exposed by Prometheus

MKE exports metrics on every node and also exports additional metrics from every controller. The metrics that controllers export are cluster-scoped (for example, the total number of Swarm services), whereas the metrics that nodes export are specific to those nodes (for example, the total memory on that node).
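The difference in scope matters when aggregating. The following minimal sketch illustrates the idea with invented sample values: a node-scoped metric is summed across nodes to obtain a cluster-wide figure, whereas a cluster-scoped metric is read once. The instance names, the values, and the assumption that every controller reports the same cluster-wide total are hypothetical and are not taken from MKE itself.

```python
# Hypothetical samples: (metric name, reporting instance, value).
# Instance names and values are invented for illustration only.
SAMPLES = [
    ("ucp_engine_memory_total_bytes", "node-1", 8_000_000_000),
    ("ucp_engine_memory_total_bytes", "node-2", 16_000_000_000),
    ("ucp_controller_services", "controller-1", 12),
    ("ucp_controller_services", "controller-2", 12),
]


def cluster_total_memory(samples):
    """Sum a node-scoped metric across all nodes."""
    return sum(
        value
        for name, _instance, value in samples
        if name == "ucp_engine_memory_total_bytes"
    )


def swarm_service_count(samples):
    """Read a cluster-scoped metric once instead of summing it.

    Assumption: each controller reports the same cluster-wide total,
    so adding the values together would overcount.
    """
    values = [
        value
        for name, _instance, value in samples
        if name == "ucp_controller_services"
    ]
    return max(values) if values else None


print(cluster_total_memory(SAMPLES))
print(swarm_service_count(SAMPLES))
```

Taking the maximum is only one way to collapse the per-controller readings; if the controllers can briefly disagree, choosing a single controller as the source may be preferable.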

The table below details the metrics that MKE exposes in Prometheus under the ucp_ prefix.

| Name | Units | Description | Labels | Metric source |
|------|-------|-------------|--------|---------------|
| ucp_controller_services | Number of services | Total number of Swarm services. | Not applicable | Controller |
| ucp_engine_container_cpu_percent | Percentage | Percentage of CPU time in use by the container. | Container | Node |
| ucp_engine_container_cpu_total_time_nanoseconds | Nanoseconds | Total CPU time used by the container. | Container | Node |
| ucp_engine_container_health | 0.0 or 1.0 | Indicates whether the container is healthy, according to its healthcheck. The 0 value indicates that the container is not reporting as healthy, likely because it either has no healthcheck defined or because healthcheck results have not yet been returned. | Container | Node |
| ucp_engine_container_memory_max_usage_bytes | Bytes | Maximum memory used by the container. | Container | Node |
| ucp_engine_container_memory_usage_bytes | Bytes | Memory currently in use by the container. | Container | Node |
| ucp_engine_container_memory_usage_percent | Percentage | Percentage of total node memory currently in use by the container. | Container | Node |
| ucp_engine_container_network_rx_bytes_total | Bytes | Number of bytes received by the container over the network in the last sample. | Container networking | Node |
| ucp_engine_container_network_rx_dropped_packets_total | Number of packets | Number of packets bound for the container over the network that were dropped in the last sample. | Container networking | Node |
| ucp_engine_container_network_rx_errors_total | Number of errors | Number of network receive errors for the container in the last sample. | Container networking | Node |
| ucp_engine_container_network_rx_packets_total | Number of packets | Number of packets received by the container over the network in the last sample. | Container networking | Node |
| ucp_engine_container_network_tx_bytes_total | Bytes | Number of bytes sent by the container over the network in the last sample. | Container networking | Node |
| ucp_engine_container_network_tx_dropped_packets_total | Number of packets | Number of packets sent from the container over the network that were dropped in the last sample. | Container networking | Node |
| ucp_engine_container_network_tx_errors_total | Number of errors | Number of network transmit errors for the container in the last sample. | Container networking | Node |
| ucp_engine_container_network_tx_packets_total | Number of packets | Number of packets sent by the container over the network in the last sample. | Container networking | Node |
| ucp_engine_container_unhealth | 0.0 or 1.0 | Indicates whether the container is unhealthy, according to its healthcheck. The 0 value indicates that the container is not reporting as unhealthy, likely because it either has no healthcheck defined or because healthcheck results have not yet been returned. | Container | Node |
| ucp_engine_containers | Number of containers | Total number of containers on the node. | Node | Node |
| ucp_engine_cpu_total_time_nanoseconds | Nanoseconds | System CPU time used by the container. | Container | Node |
| ucp_engine_disk_free_bytes | Bytes | Free disk space on the Docker root directory on the node. This metric is not available for Windows nodes. | Node | Node |
| ucp_engine_disk_total_bytes | Bytes | Total disk space on the Docker root directory on the node. Note that the ucp_engine_disk_free_bytes metric is not available for Windows nodes. | Node | Node |
| ucp_engine_images | Number of images | Total number of images on the node. | Node | Node |
| ucp_engine_memory_total_bytes | Bytes | Total amount of memory on the node. | Node | Node |
| ucp_engine_networks | Number of networks | Total number of networks on the node. | Node | Node |
| ucp_engine_node_health | 0.0 or 1.0 | Health status of the node, as determined by MKE. | nodeName: node name, nodeAddr: node IP address | Controller |
| ucp_engine_num_cpu_cores | Number of cores | Number of CPU cores on the node. | Node | Node |
| ucp_engine_pod_container_ready | 0.0 or 1.0 | Readiness of the container in a Kubernetes pod, as determined by its readiness probe. | Pod | Controller |
| ucp_engine_pod_ready | 0.0 or 1.0 | Readiness of the container in a Kubernetes pod, as determined by its readiness probe. | Pod | Controller |
| ucp_engine_volumes | Number of volumes | Total number of volumes on the node. | Node | Node |
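
To show how the ucp_ prefix can be used to pick these metrics out of a scrape, the following minimal sketch parses a snippet in the Prometheus text exposition format and keeps only the ucp_ series. The sample text, the values, and the `name` label are invented for illustration; the table above lists label categories, not the exact label names that MKE emits. The parser handles only simple sample lines and is not a full implementation of the exposition format.

```python
import re
from collections import defaultdict

# Hypothetical scrape output. Metric names come from the table above;
# the values and the "name" label are assumptions made for this example.
SAMPLE_TEXT = """\
# HELP ucp_engine_containers Total number of containers on the node.
# TYPE ucp_engine_containers gauge
ucp_engine_containers 14
ucp_engine_num_cpu_cores 4
ucp_engine_container_health{name="web"} 1
ucp_engine_container_health{name="db"} 0
go_goroutines 87
"""

# metric_name, optional {labels}, whitespace, then the sample value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
)
# label="value" pairs, allowing backslash-escaped characters in the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_ucp_metrics(text):
    """Return {metric name: [(labels dict, float value), ...]} for ucp_ series."""
    metrics = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if not match or not match.group("name").startswith("ucp_"):
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        metrics[match.group("name")].append((labels, float(match.group("value"))))
    return dict(metrics)


ucp = parse_ucp_metrics(SAMPLE_TEXT)
for metric_name, series in sorted(ucp.items()):
    print(metric_name, series)
```

For production use, an established Prometheus client library or PromQL queries against the Prometheus server are likely a better fit than hand-written parsing.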