Infrastructure connectivity monitoring

Available since MCC 2.30.0 (21.0.0 and 20.0.0)

Clusters may occasionally encounter issues caused by failures in the underlying infrastructure, such as network switches and the data center networking fabric. Monitoring the cluster underlay networking allows the operator to:

  • Reduce the time spent diagnosing outages and restoring services

  • Correlate major software stack issues with underlying infrastructure outages and visualize this data

  • Identify the weakest points in the software stack and improve product resilience

Using the InfraConnectivityMonitor resource, you can enable ping-based monitoring of network connectivity between the nodes of a management or MOSK cluster. The resulting data helps the operator correlate major software stack issues with underlying infrastructure outages and identify the weakest points of the software stack.

Once the required objects are created, the NetChecker controller generates a configuration for the network checker based on the Machine, L2Template, and Subnet objects related to the target cluster and starts ping checks for each target node.

StackLight is pre-configured to collect metrics from monitoring agents and includes CnncAgentDown and CnncNodeDown alerts. It also contains the dedicated Grafana Infra - Cross Node Connectivity dashboard to provide information about cross-node network connectivity checks.

Caution

The ICMP protocol, which the ping checks rely on, must be allowed across the whole network infrastructure, including all bare metal hosts, switches, and any routers between racks.
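Before enabling the monitoring, it can help to confirm manually that ICMP passes between two hosts. The following Python sketch is not part of the product: it assumes the Linux iputils ping utility and its usual summary line format, and the function names parse_ping_summary and icmp_reachable are hypothetical.

```python
import re
import subprocess

# Summary line printed by the Linux iputils `ping` (assumed format), for example:
# "3 packets transmitted, 3 received, 0% packet loss, time 2003ms"
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)


def parse_ping_summary(output: str) -> dict:
    """Extract sent/received counters and the loss percentage from ping output."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received, loss = match.groups()
    return {"sent": int(sent), "received": int(received), "loss_percent": float(loss)}


def icmp_reachable(address: str, count: int = 3, timeout_s: int = 2) -> bool:
    """Return True if at least one ICMP echo reply comes back from `address`."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), address],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code only means that packets were lost
    )
    try:
        return parse_ping_summary(result.stdout)["received"] > 0
    except ValueError:
        return False
```

Running icmp_reachable from one host against an address of a host in another rack exercises the switches and routers on the path between them.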

The stack of infrastructure connectivity monitoring (NetChecker) consists of the following components.

On a management cluster:

netchecker-controller

Manages the whole infrastructure connectivity monitoring configuration once the operator creates the InfraConnectivityMonitor object on the management cluster. The NetChecker controller creates and manages the CheckerInventoryConfig and NetCheckerTargetsConfig objects on the target MOSK cluster.

The InfraConnectivityMonitor object updates the configuration of the CheckerInventoryConfig and NetCheckerTargetsConfig objects according to the state of Machine, L2Template, and Subnet objects of the target cluster.

The NetChecker controller also monitors the state of the related Machine objects. If a machine is moved to maintenance mode, becomes disabled, or is scheduled for deletion, the corresponding Kubernetes node is automatically removed from the NetChecker configuration, and the reason is recorded in the status of the InfraConnectivityMonitor object.
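The exclusion rule can be illustrated with a short Python sketch. It is a simplified model, not the controller code: the MachineState class, its fields, and the reason strings are hypothetical and do not reflect the real Machine schema or status format.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    # Hypothetical, simplified view of a Machine object; not the real schema.
    name: str
    maintenance: bool = False
    disabled: bool = False
    deleting: bool = False


def split_monitored(machines):
    """Return (monitored node names, {excluded node name: reason})."""
    monitored, excluded = [], {}
    for m in machines:
        # Report one reason per node; deletion takes precedence in this sketch.
        if m.deleting:
            excluded[m.name] = "scheduled for deletion"
        elif m.disabled:
            excluded[m.name] = "disabled"
        elif m.maintenance:
            excluded[m.name] = "in maintenance mode"
        else:
            monitored.append(m.name)
    return monitored, excluded
```

Keeping the reason next to each excluded node mirrors the behavior described above, where the operator can see why a node is not being checked.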

For more details on monitoring resources described above, see API Reference: InfraConnectivityMonitor, CheckerInventoryConfig, and NetCheckerTargetsConfig.

On a target MOSK cluster:

Note

By default, the following components are always present in the netchecker namespace in new and existing clusters.

cnnc-inventory-agent

Periodically runs on each machine and collects IP addresses along with other meta information for the defined list of subnets. Updates the CheckerInventoryConfig status on each run.
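The core of such a collection step, matching local addresses against the defined subnets, can be sketched in Python with the standard ipaddress module. This is an illustration only: the function name addresses_in_subnets and the input and output shapes are assumptions, not the agent's actual data format.

```python
import ipaddress


def addresses_in_subnets(local_addresses, subnets):
    """Map each subnet CIDR to the local addresses that fall inside it."""
    networks = [ipaddress.ip_network(cidr) for cidr in subnets]
    found = {str(net): [] for net in networks}
    for raw in local_addresses:
        addr = ipaddress.ip_address(raw)
        for net in networks:
            # Membership is False when the IP versions differ (IPv4 vs IPv6).
            if addr in net:
                found[str(net)].append(str(addr))
    return found
```

A subnet that maps to an empty list indicates that the machine has no address in that subnet, which is the kind of gap the inventory controller is described as detecting.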

cnnc-inventory-controller

  • Monitors and manages the CheckerInventoryConfig status.

  • Prepares the configuration for cnnc-inventory-agent per node and analyzes output results of the node information collected by cnnc-inventory-agent.

  • Verifies that the data reported by cnnc-inventory-agent and the CheckerInventoryConfig object are in sync. If it cannot detect target machines or their IP addresses in the defined subnets, or if it finds any misconfiguration, cnnc-inventory-controller reports the errors in the CheckerInventoryConfig status.
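The verification step in the last item can be pictured as follows. The Python sketch below assumes a hypothetical report structure (node name mapped to addresses per subnet) and hypothetical error messages; the actual status fields of CheckerInventoryConfig are described in the API Reference.

```python
def inventory_errors(expected_nodes, subnets, reports):
    """Return human-readable errors for nodes with missing inventory data.

    `reports` maps node name -> {subnet CIDR: [addresses]} as collected per node.
    """
    errors = []
    for node in expected_nodes:
        report = reports.get(node)
        if report is None:
            # The node is expected but no agent data has arrived for it.
            errors.append(f"{node}: no inventory report received")
            continue
        for cidr in subnets:
            # A missing key and an empty list are both treated as "no address".
            if not report.get(cidr):
                errors.append(f"{node}: no address found in subnet {cidr}")
    return errors
```

An empty result means that every expected node reported at least one address in every defined subnet.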

cnnc-netchecker-controller

  • Monitors and manages statuses of CheckerInventoryConfig and NetCheckerTargetsConfig objects on the target cluster.

  • Propagates errors in statuses of CheckerInventoryConfig and NetCheckerTargetsConfig objects in case of inventory issues.

  • Prepares configuration for cnnc-agent to execute ping checks for IP addresses defined for each node with dedicated subnets.
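One way to picture the configuration prepared for cnnc-agent is a per-node list of peer addresses grouped by subnet. The Python sketch below assumes a full mesh within each subnet and a hypothetical inventory structure; the real NetCheckerTargetsConfig format and target selection may differ.

```python
def build_ping_targets(inventory):
    """For each node, list peer addresses to ping, grouped by subnet.

    `inventory` maps node name -> {subnet CIDR: [addresses of that node]}.
    """
    targets = {}
    for node, node_subnets in inventory.items():
        per_subnet = {}
        for cidr, own_addresses in node_subnets.items():
            if not own_addresses:
                continue  # the node has no address here, so it does not probe this subnet
            peers = [
                addr
                for peer, peer_subnets in inventory.items()
                if peer != node
                for addr in peer_subnets.get(cidr, [])
            ]
            if peers:
                # Plain string sort, used only to keep the output stable.
                per_subnet[cidr] = sorted(peers)
        targets[node] = per_subnet
    return targets
```

Grouping targets by subnet keeps each check within a single network, so a failed ping points at a specific underlay segment rather than at the node as a whole.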

cnnc-agent

Runs on each node to execute ping checks according to the configuration prepared by cnnc-netchecker-controller.