Infrastructure connectivity monitoring
Available since MCC 2.30.0 (21.0.0 and 20.0.0)
Clusters may occasionally encounter issues due to failures in the underlying infrastructure, such as network switches and the data center networking fabric. Monitoring the cluster underlay networking allows the operator to:
Reduce the time spent diagnosing outages and restoring services
Correlate major software stack issues with the underlying infrastructure outages and visualize this data
Identify the weakest points in the software stack and improve product resilience
Using the InfraConnectivityMonitor resource, you can enable network infrastructure monitoring to monitor network connectivity between management or MOSK cluster nodes using ping checks. The operator can use the monitoring data to correlate the major software stack issues with the underlying infrastructure outages as well as identify the weakest points of the software stack to make it more resilient.
Once the required objects are created, the NetChecker controller generates a configuration for the network checker based on the Machine, L2Template, and Subnet objects related to the target cluster and starts ping checks for each target node.
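The following is a minimal sketch of how per-node ping targets could be derived from an inventory of node addresses. It is an illustration only, not the NetChecker implementation: the `Inventory` structure, the `build_ping_targets` function, the subnet names, and the IP addresses are all hypothetical.

```python
from typing import Dict, List

# Hypothetical inventory: node name -> {subnet name -> IP address of that node}.
Inventory = Dict[str, Dict[str, str]]


def build_ping_targets(inventory: Inventory) -> Dict[str, Dict[str, List[str]]]:
    """For each node, list the peer IPs to ping, grouped by shared subnet."""
    targets: Dict[str, Dict[str, List[str]]] = {}
    for node, addresses in inventory.items():
        per_subnet: Dict[str, List[str]] = {}
        for subnet in addresses:
            # Peers are all other nodes that also have an address in this subnet.
            per_subnet[subnet] = [
                other_addresses[subnet]
                for other, other_addresses in sorted(inventory.items())
                if other != node and subnet in other_addresses
            ]
        targets[node] = per_subnet
    return targets


# Example data (assumed, for illustration only).
inventory = {
    "node-a": {"lcm": "10.0.0.11", "storage": "10.0.1.11"},
    "node-b": {"lcm": "10.0.0.12", "storage": "10.0.1.12"},
    "node-c": {"lcm": "10.0.0.13"},
}
targets = build_ping_targets(inventory)
print(targets["node-a"])
# {'lcm': ['10.0.0.12', '10.0.0.13'], 'storage': ['10.0.1.12']}
```

In this sketch, a node is only checked against peers that share a subnet with it, so `node-c`, which has no storage address, is never a storage target.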
StackLight is pre-configured to collect metrics from monitoring agents and includes CnncAgentDown and CnncNodeDown alerts. It also contains the dedicated Grafana Infra - Cross Node Connectivity dashboard to provide information about cross-node network connectivity checks.
Caution
The ICMP protocol used for ping checks must be allowed across the whole network infrastructure, including all bare metal hosts, switches, and any routers between racks.
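Before enabling monitoring, it may help to confirm manually that ICMP passes between hosts. The sketch below wraps the Linux iputils `ping` command; it assumes `ping` is installed on the host and uses the iputils meaning of the `-c` (packet count) and `-W` (reply timeout in seconds) options. The function names are hypothetical and not part of the product.

```python
import subprocess
from typing import List


def build_ping_command(address: str, count: int = 3, timeout_s: int = 2) -> List[str]:
    """Build a Linux iputils ping command for the given address."""
    # -c: number of echo requests to send; -W: seconds to wait for a reply.
    return ["ping", "-c", str(count), "-W", str(timeout_s), address]


def is_reachable(address: str) -> bool:
    """Return True if the host answered at least one ICMP echo request."""
    result = subprocess.run(
        build_ping_command(address),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    # iputils ping exits with 0 when at least one reply was received.
    return result.returncode == 0
```

For example, calling `is_reachable` from a host in one rack with the address of a host in another rack returns `False` if a switch, router, or host firewall on the path drops ICMP.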
The stack of infrastructure connectivity monitoring (NetChecker) consists of the following components.
On a management cluster:
netchecker-controller
Manages the whole infrastructure connectivity monitoring configuration once the operator creates the InfraConnectivityMonitor object on the management cluster. The NetChecker controller creates and manages the CheckerInventoryConfig and NetCheckerTargetsConfig objects on the target MOSK cluster.
The InfraConnectivityMonitor object updates the configuration of the CheckerInventoryConfig and NetCheckerTargetsConfig objects according to the state of Machine, L2Template, and Subnet objects of the target cluster.
Also, the NetChecker controller monitors the state of related Machine objects. If a machine is moved to the Maintenance mode, becomes Disabled, or is scheduled for deletion, the corresponding Kubernetes node is automatically removed from the NetChecker configuration with the reason provided in the status of the InfraConnectivityMonitor object.
For more details on monitoring resources described above, see API Reference: InfraConnectivityMonitor, CheckerInventoryConfig, and NetCheckerTargetsConfig.
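The exclusion rule for machines can be pictured with the following sketch. It is an assumption-based illustration, not the controller code: the `MachineState` class, its fields, the `split_targets` function, and the reason strings are hypothetical and do not reflect the actual Machine or InfraConnectivityMonitor schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MachineState:
    """Hypothetical, simplified view of a Machine object."""
    node_name: str
    maintenance: bool = False
    disabled: bool = False
    deleting: bool = False


def split_targets(machines: List[MachineState]) -> Tuple[List[str], Dict[str, str]]:
    """Return (nodes to monitor, excluded nodes mapped to the exclusion reason)."""
    monitored: List[str] = []
    excluded: Dict[str, str] = {}
    for machine in machines:
        if machine.deleting:
            excluded[machine.node_name] = "machine is scheduled for deletion"
        elif machine.disabled:
            excluded[machine.node_name] = "machine is disabled"
        elif machine.maintenance:
            excluded[machine.node_name] = "machine is in maintenance mode"
        else:
            monitored.append(machine.node_name)
    return monitored, excluded


# Example data (assumed, for illustration only).
monitored, excluded = split_targets([
    MachineState("node-a"),
    MachineState("node-b", maintenance=True),
    MachineState("node-c", deleting=True),
])
print(monitored)  # ['node-a']
print(excluded)
```

Keeping the reason next to each excluded node mirrors the behavior described above, where the reason is reported in the object status rather than the node silently disappearing from the checks.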
On a target MOSK cluster:
Note
By default, the following components are always present in the netchecker namespace in new and existing clusters.
cnnc-inventory-agent
Periodically runs on each machine and collects IP addresses along with other meta information for the defined list of subnets. Updates the CheckerInventoryConfig status on each run.
cnnc-inventory-controller
Monitors and manages the CheckerInventoryConfig status.
Prepares the configuration for cnnc-inventory-agent per node and analyzes the node information collected by cnnc-inventory-agent.
Verifies whether the data reported by cnnc-inventory-agent and the CheckerInventoryConfig object are in sync. If it does not detect target machines or their IP addresses from the defined subnets, or in case of any misconfiguration, cnnc-inventory-controller propagates errors in the CheckerInventoryConfig status.
cnnc-netchecker-controller
Monitors and manages statuses of CheckerInventoryConfig and NetCheckerTargetsConfig objects on the target cluster.
Propagates errors in statuses of CheckerInventoryConfig and NetCheckerTargetsConfig objects in case of inventory issues.
Prepares configuration for cnnc-agent to execute ping checks for IP addresses defined for each node with dedicated subnets.
cnnc-agent
Runs on each node to execute ping checks according to the configuration prepared by cnnc-netchecker-controller.
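The inventory step performed by cnnc-inventory-agent and cnnc-inventory-controller (collect node addresses for the defined subnets, then report an error when an address is missing) can be sketched as follows. This is a simplified illustration using the Python standard `ipaddress` module; the `match_addresses` function, the data layout, and the error text are hypothetical and do not represent the CheckerInventoryConfig schema.

```python
import ipaddress
from typing import Dict, List, Tuple


def match_addresses(
    subnets: Dict[str, str],
    node_addresses: Dict[str, List[str]],
) -> Tuple[Dict[str, Dict[str, str]], List[str]]:
    """Map each node's addresses to the defined subnets and collect errors."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    inventory: Dict[str, Dict[str, str]] = {}
    errors: List[str] = []
    for node, addresses in node_addresses.items():
        parsed = [ipaddress.ip_address(address) for address in addresses]
        inventory[node] = {}
        for name, network in networks.items():
            # Take the first address of the node that belongs to this subnet.
            match = next((ip for ip in parsed if ip in network), None)
            if match is None:
                errors.append(f"{node}: no address found in subnet {name} ({network})")
            else:
                inventory[node][name] = str(match)
    return inventory, errors


# Example data (assumed, for illustration only).
subnets = {"lcm": "10.0.0.0/24", "storage": "10.0.1.0/24"}
node_addresses = {
    "node-a": ["10.0.0.11", "10.0.1.11"],
    "node-b": ["10.0.0.12", "192.168.5.4"],
}
inventory, errors = match_addresses(subnets, node_addresses)
print(inventory)
print(errors)  # ['node-b: no address found in subnet storage (10.0.1.0/24)']
```

Here `node-b` has no address in the storage subnet, so it is reported as an error instead of being silently skipped, which is the kind of condition the text above describes as being propagated to the object status.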