

Outbound telemetry

Outbound Product Telemetry is a standard architectural component of Mirantis OpenStack for Kubernetes (MOSK) designed to provide Mirantis with the visibility necessary to ensure the health, stability, and ongoing improvement of customer environments. It serves as the primary mechanism for Mirantis to understand product behavior in the field, enabling the team to provide high-quality service and data-driven product development.

The primary goal of product telemetry is to shift from reactive to proactive support by transforming technical data into actionable business value. By analyzing usage patterns, Mirantis can prevent service disruptions by identifying declining utilization or resource exhaustion before these issues impact critical workloads. This visibility also allows Mirantis to optimize hardware performance by correlating specific hardware models and sizing with real-world performance, which in turn provides precise architectural guidance for future deployments.

Furthermore, telemetry data is used to inform the product roadmap, allowing the team to prioritize features and updates based on the actual versions and services currently in use across the entire customer base. Ultimately, this system helps sustain customer value by allowing Mirantis to proactively approach cloud operators to solve workload onboarding issues before they can impact the long-term viability of the cloud infrastructure.

Architecture and data flow

The telemetry subsystem is designed as a secure, one-way communication channel from the customer environment to Mirantis. This process begins with the StackLight component in the management cluster, which automatically collects health, usage, and performance metrics from both the management cluster and all managed MOSK clusters. Once collected, the data is processed and aggregated according to predefined rules to ensure that only high-level system metadata is prepared for transmission.

Following aggregation, the telemetry data is pushed through the customer firewall or internet proxy to a secure, encrypted Mirantis telemetry endpoint. For this data flow to function correctly, it is essential that the customer networking team configures the necessary outbound rules to allow traffic to safely reach the designated Mirantis synchronization endpoint.
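The exact endpoint hostname and proxy details are provided by Mirantis and vary per deployment; the hostnames below are placeholders used only to illustrate the shape of the outbound rules. A minimal sketch, assuming an iptables-based firewall and a Squid-style proxy:

```shell
# Placeholder hostnames -- substitute the values supplied by Mirantis
# and your own proxy address before use.
TELEMETRY_HOST="telemetry.mirantis.example"

# Firewall: permit outbound HTTPS (TCP 443) to the telemetry endpoint.
iptables -A OUTPUT -p tcp -d "$TELEMETRY_HOST" --dport 443 -j ACCEPT

# Squid-style proxy allowlist (squid.conf fragment):
#   acl mirantis_telemetry dstdomain telemetry.mirantis.example
#   http_access allow mirantis_telemetry

# Pre-flight check from a management cluster node, via the proxy:
https_proxy="http://proxy.corp.example:3128" \
  curl --silent --show-error --head "https://$TELEMETRY_HOST"
```

Because the channel is one-way and push-based, only outbound rules are required; no inbound ports need to be opened toward the cluster.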

Data privacy: safe by design

Telemetry is explicitly designed to describe the state and performance of the infrastructure, not the content of the data processed within it. No Personally Identifiable Information (PII) or sensitive data is collected.

What is collected

The telemetry service collects approximately 150 distinct metrics categorized as follows:

  • Infrastructure health – names of active firing alerts, node counts, and the availability of core Kubernetes and OpenStack APIs.

  • Capacity & usage – physical and virtual CPU/RAM capacity, node filesystem size, and total storage requested using Persistent Volume Claims (PVCs).
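As an illustration of the aggregation step, derived ratios such as cluster_usage_per_capacity_cpu_ratio can be computed from the raw capacity and usage gauges before transmission. A minimal sketch with invented sample values:

```python
# Illustrative only: sample values are made up; the metric names in the
# comments refer to the gauges described in this document.
capacity_cpu_cores = 96.0   # cluster_capacity_cpu_cores
usage_cpu_cores = 24.0      # cluster_usage_cpu_cores, summed over nodes

def usage_per_capacity(usage: float, capacity: float) -> float:
    """Return the usage/capacity ratio, guarding against empty clusters."""
    return usage / capacity if capacity else 0.0

print(usage_per_capacity(usage_cpu_cores, capacity_cpu_cores))  # 0.25
```

Only the resulting high-level ratio needs to leave the site; the per-node samples it was computed from stay local.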

What is not collected

To protect customer privacy and security, the telemetry subsystem is strictly prohibited from collecting:

  • Tenant data – no application data, customer database content, or virtual machine file content is ever accessed.

  • Sensitive information – no secrets, passwords, certificates, or encryption keys are part of the collection schema.

  • Personal data – no names, email addresses, or user-identifiable records are gathered.

  • Internal network identifiers – no IP addresses or specific hostnames are transmitted; all identifiers are machine-generated and anonymized.
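To illustrate the last point, an internal hostname can be replaced with an opaque, machine-generated token before transmission. The sketch below is an illustration of the general technique, not Mirantis's actual anonymization scheme; the salt value is hypothetical:

```python
import hashlib

def anonymize(identifier: str, salt: str = "per-cluster-random-salt") -> str:
    """Map an internal hostname to an opaque ID.

    Illustration only -- not the actual Mirantis scheme. The same
    input always yields the same token, so trends can still be
    correlated without revealing the original name.
    """
    digest = hashlib.sha256((salt + identifier).encode()).hexdigest()
    return "node-" + digest[:12]

token = anonymize("compute-01.corp.internal")
assert "compute-01" not in token  # the raw hostname never leaves the site
```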

Support and service delivery

To provide proactive care and rapid diagnostics, MOSK telemetry serves as the operational foundation for Mirantis Support and Managed Services. The MOSK Support Services Exhibit, specifically Attachment 1 covering OpsCare Plus, states that automated telemetry is a core requirement that establishes a data-driven baseline for the technical health of the cloud infrastructure. The data allows Mirantis to perform efficient diagnostics, offer proactive assistance, and verify the environmental conditions necessary for Service Level Agreement (SLA) attainment.

Maintaining an active telemetry collector is a standard operational practice essential for high-quality service delivery: it provides the visibility into the environment on which continuous support depends. Any modifications to the telemetry subsystem configuration must be coordinated with Mirantis to ensure the environment remains aligned with standard support prerequisites.

Outbound metrics

General cluster information

  • kaas_cluster_info – cluster metadata: type, Kubernetes release, license, underlay

  • cluster_alerts_firing – number of active alerts currently firing in the cluster

  • cluster_capacity_cpu_cores – total CPU cores available to the workloads in the cluster

  • cluster_capacity_memory_bytes – total RAM (in bytes) available in the cluster

  • cluster_filesystem_size_bytes – total storage capacity of the local filesystems on cluster nodes

  • cluster_filesystem_usage_bytes – amount of node disk space currently used

  • cluster_filesystem_usage_ratio – percentage of node disk space currently consumed

  • cluster_master_nodes_total – number of control-plane/master nodes

  • cluster_nodes_total – total number of nodes in the cluster

  • cluster_persistentvolumeclaim_requests_storage_bytes – total storage space requested by user persistent volumes

  • cluster_total_alerts_triggered – total count of unique alert events that have fired

  • cluster_usage_cpu_cores – CPU cores currently used on a node

  • cluster_usage_memory_bytes – RAM currently used on a node

  • cluster_usage_per_capacity_cpu_ratio – overall CPU utilization ratio

  • cluster_usage_per_capacity_memory_ratio – overall RAM utilization ratio

  • cluster_worker_nodes_total – number of worker nodes available for user apps

  • cluster_workload_containers_total – total number of individual containers running

  • cluster_workload_pods_total – total number of Kubernetes Pods running
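The gauges above follow the standard Prometheus text exposition format, since StackLight is Prometheus-based. A minimal sketch of reading such gauges, with an invented sample payload:

```python
# Invented sample in Prometheus text exposition format; the metric
# names match the list above, the values do not come from a real cluster.
SAMPLE = """\
# TYPE cluster_nodes_total gauge
cluster_nodes_total 6
cluster_master_nodes_total 3
cluster_worker_nodes_total 3
cluster_usage_per_capacity_cpu_ratio 0.25
"""

def parse_gauges(text: str) -> dict:
    """Extract simple (label-free) gauges from exposition-format text."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        name, _, value = line.partition(" ")
        metrics[name] = float(value)
    return metrics

gauges = parse_gauges(SAMPLE)
print(gauges["cluster_nodes_total"])  # 6.0
```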

Cluster lifecycle management

  • kaas_cluster_machines_ready_total – count of ready nodes in a specific cluster

  • kaas_cluster_machines_requested_total – number of nodes that the user has requested for the cluster

  • kaas_cluster_manager_machines_total – number of nodes with the role of a Kubernetes manager in the cluster

  • kaas_cluster_updating – flag showing if a cluster is in the middle of an upgrade

  • kaas_cluster_worker_machines_total – count of worker nodes within a specific managed cluster

  • kaas_info – current version and build info of the KaaS service

  • kaas_license_expiry – the date and time when the cluster software license expires

  • kaas-events – high-level status and versioning of the KaaS platform

  • kubernetes_api_availability – health check for the core Kubernetes API server

  • mcc_cluster_update_plan_status – readiness/success of maintenance and update plans

  • mcc_collector_error – count of errors within the telemetry collection system itself

  • hostos_module_usage – tracking of active kernel modules on the host operating system

Hardware information

  • mcc_hw_machine_chassis – the physical form factor, for example, Blade or Rack Mount

  • mcc_hw_machine_cpu_model – the specific model of processor, for example, Intel Xeon Gold

  • mcc_hw_machine_cpu_number – number of physical CPU sockets in the server

  • mcc_hw_machine_nics – details on physical network interface cards

  • mcc_hw_machine_ram – total physical memory installed in the server hardware

  • mcc_hw_machine_storage – details of physical local drives (HDD/SSD)

  • mcc_hw_machine_vendor – the hardware manufacturer, for example, Dell, HP, or Supermicro

  • mcc_release_controller_state – status of the component managing software releases

Kubernetes underlay (Mirantis Kubernetes Engine)

  • mke_api_availability – availability of the cluster’s underlying Mirantis Kubernetes Engine API

  • mke_cluster_containers_total – count of containers running in Kubernetes underlay

  • mke_cluster_nodes_total – count of nodes in Kubernetes underlay

  • mke_cluster_vcpu_free – remaining vCPU capacity in Kubernetes underlay

  • mke_cluster_vcpu_used – consumed vCPU capacity in Kubernetes underlay

  • mke_cluster_vram_free – remaining RAM capacity in Kubernetes underlay

  • mke_cluster_vram_used – consumed RAM capacity in Kubernetes underlay

  • mke_cluster_vstorage_free – remaining virtual storage in Kubernetes underlay

  • mke_cluster_vstorage_used – consumed virtual storage in Kubernetes underlay

  • node_labels – labels assigned to the nodes in Kubernetes underlay

OpenStack services

  • openstack_cinder_api_latency_90 – response latency of the Block Storage service (OpenStack Cinder) API – 90th percentile

  • openstack_cinder_api_latency_99 – response latency of the Block Storage service (OpenStack Cinder) API – 99th percentile

  • openstack_cinder_api_status – current health status of the Block Storage service (OpenStack Cinder) API endpoint

  • openstack_cinder_availability – ratio of successful (non-5xx) responses of Block Storage service (OpenStack Cinder) API

  • openstack_cinder_volumes_total – total number of volumes

  • openstack_glance_api_status – current health status of the Image service (OpenStack Glance) API

  • openstack_glance_availability – ratio of successful (non-5xx) responses of the Image service (OpenStack Glance) API

  • openstack_glance_images_total – total number of images

  • openstack_glance_snapshots_total – total number of backup snapshots

  • openstack_heat_availability – ratio of successful (non-5xx) responses of the Orchestration service (OpenStack Heat) API

  • openstack_heat_stacks_total – total number of orchestration stacks

  • openstack_instance_availability – ratio of instances in non-error state

  • openstack_instance_create_end – counter of successful instance creations

  • openstack_instance_create_error – counter of failed instance creations

  • openstack_instance_create_start – counter of attempted instance creations

  • openstack_keystone_api_latency_90 – response latency of the Identity service (OpenStack Keystone) API – 90th percentile

  • openstack_keystone_api_latency_99 – response latency of the Identity service (OpenStack Keystone) API – 99th percentile

  • openstack_keystone_api_status – current health status of the Identity service (OpenStack Keystone) API

  • openstack_keystone_availability – ratio of successful (non-5xx) responses of the Identity service (OpenStack Keystone) API

  • openstack_keystone_tenants_total – number of projects

  • openstack_keystone_users_total – total number of registered user accounts

  • openstack_kpi_provisioning – ratio of successful instance creations

  • openstack_lbaas_availability – ratio of load balancers in non-error state

  • openstack_mysql_flow_control – health indicator of the OpenStack database cluster

  • openstack_neutron_api_latency_90 – response latency of the Network service (OpenStack Neutron) API – 90th percentile

  • openstack_neutron_api_latency_99 – response latency of the Network service (OpenStack Neutron) API – 99th percentile

  • openstack_neutron_api_status – current health status of the Network service (OpenStack Neutron) API

  • openstack_neutron_availability – ratio of successful (non-5xx) responses of the Network service (OpenStack Neutron) API

  • openstack_neutron_lbaas_loadbalancers_total – total number of load balancers

  • openstack_neutron_networks_total – total number of networks

  • openstack_neutron_ports_total – total number of network ports

  • openstack_neutron_routers_total – total number of routers

  • openstack_neutron_subnets_total – total number of subnets

  • openstack_nova_all_compute_cpu_utilisation – global CPU usage percentage of all hypervisors

  • openstack_nova_all_compute_mem_utilisation – global RAM usage percentage of all hypervisors

  • openstack_nova_all_computes_total – total number of compute nodes in the cluster

  • openstack_nova_all_disk_total_gb – global capacity of root/ephemeral storage (GB)

  • openstack_nova_all_ram_total_gb – global RAM capacity (GB)

  • openstack_nova_all_used_disk_total_gb – global root/ephemeral storage used (GB)

  • openstack_nova_all_used_ram_total_gb – global RAM used (GB)

  • openstack_nova_all_used_vcpus_total – global vCPUs allocated to VMs

  • openstack_nova_all_vcpus_total – global vCPU capacity

  • openstack_nova_api_status – current health status of the Compute service (OpenStack Nova) API

  • openstack_nova_availability – ratio of successful (non-5xx) responses from the Compute service (OpenStack Nova) API

  • openstack_nova_compute_cpu_utilisation – CPU utilization across all available compute nodes in the cluster

  • openstack_nova_compute_mem_utilisation – RAM utilization across all available compute nodes in the cluster

  • openstack_nova_computes_total – number of compute nodes in the cluster

  • openstack_nova_disk_total_gb – per-server disk capacity

  • openstack_nova_instances_active_total – total instances currently in Active state

  • openstack_nova_ram_total_gb – per-server RAM capacity

  • openstack_nova_used_disk_total_gb – per-server disk used

  • openstack_nova_used_ram_total_gb – per-server RAM used

  • openstack_nova_used_vcpus_total – per-server vCPUs allocated

  • openstack_nova_vcpus_total – per-server vCPU capacity

  • openstack_public_api_status – status of the public-facing API endpoints

  • openstack_quota_instances – maximum allowed VM instances (limit)

  • openstack_quota_ram_gb – maximum allowed RAM (limit)

  • openstack_quota_vcpus – maximum allowed vCPUs (limit)

  • openstack_quota_volume_storage_gb – maximum allowed storage (limit)

  • openstack_rmq_message_deriv – total RabbitMQ queue depth, that is, the number of ready and unacknowledged messages

  • openstack_usage_instances – current VM instance count vs the limit

  • openstack_usage_ram_gb – current RAM usage vs the limit

  • openstack_usage_vcpus – current vCPU count vs the limit

  • openstack_usage_volume_storage_gb – current storage usage vs the limit

  • osdpl_aodh_alarms – number of alarms in the Alarming service (OpenStack Aodh)

  • osdpl_api_success – availability status of all OpenStack API endpoints, including internal and admin

  • osdpl_cinder_zone_volumes – number of volumes per availability zone in the Block Storage service (OpenStack Cinder)

  • osdpl_ironic_nodes – number of bare-metal servers managed by the Bare Metal service (OpenStack Ironic)

  • osdpl_manila_shares – number of shared filesystems managed by the Shared Filesystem service (OpenStack Manila)

  • osdpl_masakari_hosts – number of hypervisors protected by the Instance HA service (OpenStack Masakari)

  • osdpl_neutron_availability_zone_info – metadata of availability zones defined in the Networking service (OpenStack Neutron)

  • osdpl_neutron_zone_routers – number of routers per availability zone in the Networking service (OpenStack Neutron)

  • osdpl_nova_aggregate_hosts – number of compute nodes per host aggregate

  • osdpl_nova_audit_orphaned_allocations – number of instance records in the Compute service (OpenStack Nova) DB that might be stuck or orphaned

  • osdpl_nova_availability_zone_hosts – number of compute nodes per availability zone

  • osdpl_nova_availability_zone_info – metadata of availability zones defined in the Compute service (OpenStack Nova)

  • osdpl_nova_availability_zone_instances – number of instances per availability zone

  • osdpl_version_info – OpenStack version, for example Antelope

  • tf_operator_info – OpenSDN/Tungsten Fabric version
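The latency metrics above (for example, openstack_cinder_api_latency_90 and _99) are percentile summaries of raw API request timings. A minimal sketch of the conventional computation, using synthetic sample data:

```python
import random
import statistics

# Synthetic request latencies in milliseconds; real values would come
# from API access logs or instrumented request handlers.
random.seed(7)
latencies_ms = [random.uniform(5.0, 200.0) for _ in range(1000)]

# quantiles(n=100) returns the 99 percentile cut points P1..P99.
cuts = statistics.quantiles(latencies_ms, n=100)
latency_90 = cuts[89]   # 90th percentile, cf. *_api_latency_90
latency_99 = cuts[98]   # 99th percentile, cf. *_api_latency_99
```

Reporting two percentiles rather than a mean keeps the metric robust against a small number of pathologically slow requests while still exposing tail latency.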