Outbound telemetry¶
Outbound Product Telemetry is a standard architectural component of Mirantis OpenStack for Kubernetes (MOSK) designed to provide Mirantis with the visibility necessary to ensure the health, stability, and ongoing improvement of customer environments. It serves as the primary mechanism for Mirantis to understand product behavior in the field, enabling the team to deliver high-quality service and pursue data-driven product development.
The primary goal of product telemetry is to shift from reactive to proactive support by turning technical data into actionable business value. By analyzing usage patterns, Mirantis can identify declining utilization or resource exhaustion before these issues impact critical workloads and disrupt service. This visibility also allows Mirantis to optimize hardware performance by correlating specific hardware models and sizing with real-world performance, which in turn yields precise architectural guidance for future deployments.
Furthermore, telemetry data is used to inform the product roadmap, allowing the team to prioritize features and updates based on the actual versions and services currently in use across the entire customer base. Ultimately, this system helps sustain customer value by allowing Mirantis to proactively approach cloud operators to solve workload onboarding issues before they can impact the long-term viability of the cloud infrastructure.
Architecture and data flow¶
The telemetry subsystem is designed as a secure, one-way communication channel from the customer environment to Mirantis. This process begins with the StackLight component in the management cluster, which automatically collects health, usage, and performance metrics from both the management cluster and all managed MOSK clusters. Once collected, the data is processed and aggregated according to predefined rules to ensure that only high-level system metadata is prepared for transmission.
Following aggregation, the telemetry data is pushed through the customer firewall or internet proxy to a secure, encrypted Mirantis telemetry endpoint. For this data flow to function correctly, the customer networking team must configure the necessary outbound rules so that traffic can reach the designated Mirantis synchronization endpoint.
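Because the push is ordinary outbound HTTPS, the path can be verified before relying on it. The following is a minimal sketch of such a check; the endpoint URL is a placeholder and the proxy is read from the conventional HTTPS_PROXY environment variable, both assumptions rather than documented values:

```python
import os
import urllib.error
import urllib.request

# Placeholder URL -- substitute the synchronization endpoint designated for
# your environment; this value is an assumption, not the documented endpoint.
TELEMETRY_ENDPOINT = "https://telemetry.example.mirantis.com/"
# Corporate proxy, if any, taken from the conventional environment variable.
proxy = os.environ.get("HTTPS_PROXY")

handlers = [urllib.request.ProxyHandler({"https": proxy})] if proxy else []
opener = urllib.request.build_opener(*handlers)

try:
    opener.open(TELEMETRY_ENDPOINT, timeout=10)
    print("Outbound path is open")
except urllib.error.HTTPError as exc:
    # Any HTTP status proves that TLS traffic traverses the firewall/proxy.
    print(f"Outbound path is open (endpoint answered with HTTP {exc.code})")
except (urllib.error.URLError, OSError) as exc:
    print(f"Outbound path is blocked or unreachable: {exc}")
```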
Data privacy: safe by design¶
Telemetry is explicitly designed to describe the state and performance of the infrastructure, not the content of the data processed within it. No Personally Identifiable Information (PII) or sensitive data is collected.
What is collected¶
The telemetry service collects approximately 150 distinct metrics, categorized as follows (an illustrative sample record appears after the list):
Infrastructure health – names of active firing alerts, node counts, and the availability of core Kubernetes and OpenStack APIs.
Capacity & usage – physical and virtual CPU/RAM capacity, node filesystem size, and total storage requested using Persistent Volume Claims (PVCs).
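To make the two categories concrete, the sketch below shows the kind of high-level record that such aggregation could produce. All field names and values are invented for illustration and do not represent the actual MOSK telemetry schema or wire format:

```python
# Invented example -- not the actual MOSK telemetry schema or wire format.
sample_record = {
    "cluster_id": "a1b2c3d4",              # machine-generated, anonymized
    "infrastructure_health": {
        "alerts_firing": ["KubeAPIDown"],  # alert names only, no payloads
        "nodes_total": 12,
        "kubernetes_api_available": True,
        "openstack_api_available": True,
    },
    "capacity_and_usage": {
        "cpu_cores_physical": 384,
        "memory_bytes_physical": 3_298_534_883_328,    # 3 TiB
        "node_filesystem_bytes": 10_995_116_277_760,   # 10 TiB
        "pvc_requested_bytes": 42_949_672_960,         # 40 GiB
    },
}
```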
What is not collected¶
To protect customer privacy and security, the telemetry subsystem is strictly prohibited from collecting:
Tenant data – no application data, customer database content, or virtual machine file content is ever accessed.
Sensitive information – no secrets, passwords, certificates, or encryption keys are part of the collection schema.
Personal data – no names, email addresses, or user-identifiable records are gathered.
Internal network identifiers – no IP addresses or specific hostnames are transmitted; all identifiers are machine-generated and anonymized.
Support and service delivery¶
To provide proactive care and rapid diagnostics, MOSK telemetry serves as the operational foundation for Mirantis Support and Managed Services. The MOSK Support Services Exhibit and, specifically, Attachment 1 covering OpsCare Plus state that automated telemetry is a core requirement that establishes a data-driven baseline for the technical health of the cloud infrastructure. The data allows Mirantis to perform efficient diagnostics, offer proactive assistance, and verify the environmental conditions necessary for Service Level Agreement (SLA) attainment.
Maintaining an active telemetry collector is a standard operational practice essential for high-quality service delivery: it preserves the visibility into the environment on which uninterrupted support depends. Any modifications to the configuration of the telemetry subsystem must therefore be coordinated with Mirantis to ensure the environment remains aligned with standard support prerequisites.
Outbound metrics¶
General cluster information¶
kaas_cluster_info – cluster metadata: type, Kubernetes release, license, underlay
cluster_alerts_firing – number of active alerts currently firing in the cluster
cluster_capacity_cpu_cores – total CPU cores available to the workloads in the cluster
cluster_capacity_memory_bytes – total RAM (in bytes) available in the cluster
cluster_filesystem_size_bytes – total storage capacity of the local filesystems on cluster nodes
cluster_filesystem_usage_bytes – amount of node disk space currently used
cluster_filesystem_usage_ratio – percentage of node disk space currently consumed
cluster_master_nodes_total – number of control-plane/master nodes
cluster_nodes_total – total number of nodes in the cluster
cluster_persistentvolumeclaim_requests_storage_bytes – total storage space requested by user persistent volumes
cluster_total_alerts_triggered – total count of unique alert events that have fired
cluster_usage_cpu_cores – CPU cores currently used on a node
cluster_usage_memory_bytes – RAM currently used on a node
cluster_usage_per_capacity_cpu_ratio – overall CPU utilization ratio
cluster_usage_per_capacity_memory_ratio – overall RAM utilization ratio
cluster_worker_nodes_total – number of worker nodes available for user apps
cluster_workload_containers_total – total number of individual containers running
cluster_workload_pods_total – total number of Kubernetes Pods running
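These metrics originate in StackLight, so operators can inspect the same series locally before anything is transmitted. The sketch below assumes the StackLight Prometheus API has been exposed at localhost:9090, for example via a port-forward; that address is an assumption, not a fixed product endpoint:

```python
import json
import urllib.parse
import urllib.request

# Assumes the StackLight Prometheus API is reachable here, e.g. after a
# port-forward; adjust the address for your environment.
PROMETHEUS = "http://localhost:9090/api/v1/query"

def instant_query(expr: str) -> list:
    """Run an instant PromQL query and return the result vector."""
    url = f"{PROMETHEUS}?{urllib.parse.urlencode({'query': expr})}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]["result"]

# Cross-check the reported overall utilization against its raw inputs.
expr = "sum(cluster_usage_cpu_cores) / sum(cluster_capacity_cpu_cores)"
for series in instant_query(expr):
    print("cluster CPU utilization ratio:", series["value"][1])
```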
Cluster lifecycle management¶
kaas_cluster_machines_ready_total – count of ready nodes in a specific cluster
kaas_cluster_machines_requested_total – number of nodes the user has requested to exist
kaas_cluster_manager_machines_total – number of nodes with the role of a Kubernetes manager in the cluster
kaas_cluster_updating – flag showing if a cluster is in the middle of an upgrade
kaas_cluster_worker_machines_total – count of worker nodes within a specific managed cluster
kaas_info – current version and build info of the KaaS service
kaas_license_expiry – the date and time when the cluster software license expires
kaas-events – high-level status and versioning of the KaaS platform
kubernetes_api_availability – health check for the core Kubernetes API server
mcc_cluster_update_plan_status – readiness/success of maintenance and update plans
mcc_collector_error – count of errors within the telemetry collection system itself
hostos_module_usage – tracking of active kernel modules on the host operating system
Hardware information¶
mcc_hw_machine_chassis – the physical form factor, for example, Blade or Rack Mount
mcc_hw_machine_cpu_model – the specific model of processor, for example, Intel Xeon Gold
mcc_hw_machine_cpu_number – number of physical CPU sockets in the server
mcc_hw_machine_nics – details on physical network interface cards
mcc_hw_machine_ram – total physical memory installed in the server hardware
mcc_hw_machine_storage – details of physical local drives (HDD/SSD)
mcc_hw_machine_vendor – the hardware manufacturer, for example, Dell, HP, or Supermicro
mcc_release_controller_state – status of the component managing software releases
Kubernetes underlay (Mirantis Kubernetes Engine)¶
mke_api_availability – availability of the cluster’s underlying Mirantis Kubernetes Engine API
mke_cluster_containers_total – count of containers running in the Kubernetes underlay
mke_cluster_nodes_total – count of nodes in the Kubernetes underlay
mke_cluster_vcpu_free – remaining vCPU capacity in the Kubernetes underlay
mke_cluster_vcpu_used – consumed vCPU capacity in the Kubernetes underlay
mke_cluster_vram_free – remaining RAM capacity in the Kubernetes underlay
mke_cluster_vram_used – consumed RAM capacity in the Kubernetes underlay
mke_cluster_vstorage_free – remaining virtual storage in the Kubernetes underlay
mke_cluster_vstorage_used – consumed virtual storage in the Kubernetes underlay
node_labels – labels assigned to the nodes in the Kubernetes underlay
OpenStack services¶
openstack_cinder_api_latency_90 – response latency of the Block Storage service (OpenStack Cinder) API – 90th percentile
openstack_cinder_api_latency_99 – response latency of the Block Storage service (OpenStack Cinder) API – 99th percentile
openstack_cinder_api_status – current health status of the Block Storage service (OpenStack Cinder) API endpoint
openstack_cinder_availability – ratio of successful (non-5xx) responses of the Block Storage service (OpenStack Cinder) API
openstack_cinder_volumes_total – total number of volumes
openstack_glance_api_status – current health status of the Image service (OpenStack Glance) API
openstack_glance_availability – ratio of successful (non-5xx) responses of the Image service (OpenStack Glance) API
openstack_glance_images_total – total number of images
openstack_glance_snapshots_total – total number of backup snapshots
openstack_heat_availability – ratio of successful (non-5xx) responses of the Orchestration service (OpenStack Heat) API
openstack_heat_stacks_total – total number of orchestration stacks
openstack_instance_availability – ratio of instances in non-error state
openstack_instance_create_end – counter of successful instance creations
openstack_instance_create_error – counter of failed instance creations
openstack_instance_create_start – counter of attempted instance creations
openstack_keystone_api_latency_90 – response latency of the Identity service (OpenStack Keystone) API – 90th percentile
openstack_keystone_api_latency_99 – response latency of the Identity service (OpenStack Keystone) API – 99th percentile
openstack_keystone_api_status – current health status of the Identity service (OpenStack Keystone) API
openstack_keystone_availability – ratio of successful (non-5xx) responses of the Identity service (OpenStack Keystone) API
openstack_keystone_tenants_total – number of projects
openstack_keystone_users_total – total number of registered user accounts
openstack_kpi_provisioning – ratio of successful instance creations
openstack_lbaas_availability – ratio of load balancers in non-error state
openstack_mysql_flow_control – health indicator of the OpenStack database cluster
openstack_neutron_api_latency_90 – response latency of the Network service (OpenStack Neutron) API – 90th percentile
openstack_neutron_api_latency_99 – response latency of the Network service (OpenStack Neutron) API – 99th percentile
openstack_neutron_api_status – current health status of the Network service (OpenStack Neutron) API
openstack_neutron_availability – ratio of successful (non-5xx) responses of the Network service (OpenStack Neutron) API
openstack_neutron_lbaas_loadbalancers_total – total number of load balancers
openstack_neutron_networks_total – total number of networks
openstack_neutron_ports_total – total number of network ports
openstack_neutron_routers_total – total number of routers
openstack_neutron_subnets_total – total number of subnets
openstack_nova_all_compute_cpu_utilisation – global CPU usage percentage of all hypervisors
openstack_nova_all_compute_mem_utilisation – global RAM usage percentage of all hypervisors
openstack_nova_all_computes_total – total number of compute nodes in the cluster
openstack_nova_all_disk_total_gb – global capacity of root/ephemeral storage (GB)
openstack_nova_all_ram_total_gb – global RAM capacity (GB)
openstack_nova_all_used_disk_total_gb – global root/ephemeral storage used (GB)
openstack_nova_all_used_ram_total_gb – global RAM used (GB)
openstack_nova_all_used_vcpus_total – global vCPUs allocated to VMs
openstack_nova_all_vcpus_total – global vCPU capacity
openstack_nova_api_status – current health status of the Compute service (OpenStack Nova) API
openstack_nova_availability – ratio of successful (non-5xx) responses from the Compute service (OpenStack Nova) API
openstack_nova_compute_cpu_utilisation – CPU utilization across all available compute nodes in the cluster
openstack_nova_compute_mem_utilisation – RAM utilization across all available compute nodes in the cluster
openstack_nova_computes_total – number of compute nodes in the cluster
openstack_nova_disk_total_gb – per-server disk capacity
openstack_nova_instances_active_total – total instances currently in the Active state
openstack_nova_ram_total_gb – per-server RAM capacity
openstack_nova_used_disk_total_gb – per-server disk used
openstack_nova_used_ram_total_gb – per-server RAM used
openstack_nova_used_vcpus_total – per-server vCPUs allocated
openstack_nova_vcpus_total – per-server vCPU capacity
openstack_public_api_status – status of the public-facing API endpoints
openstack_quota_instances – maximum allowed VM instances (limit)
openstack_quota_ram_gb – maximum allowed RAM (limit)
openstack_quota_vcpus – maximum allowed vCPUs (limit)
openstack_quota_volume_storage_gb – maximum allowed storage (limit)
openstack_rmq_message_deriv – RabbitMQ message total queue depth, number of ready and acknowledged messages
openstack_usage_instances – current VM instance count vs the limit
openstack_usage_ram_gb – current RAM usage vs the limit
openstack_usage_vcpus – current vCPU count vs the limit
openstack_usage_volume_storage_gb – current storage usage vs the limit
osdpl_aodh_alarms – number of alarms in the Alarming service (OpenStack Aodh)
osdpl_api_success – availability status of all OpenStack API endpoints, including internal and admin
osdpl_cinder_zone_volumes – number of volumes per availability zone in the Block Storage service (OpenStack Cinder)
osdpl_ironic_nodes – number of bare-metal servers managed by the Bare Metal service (OpenStack Ironic)
osdpl_manila_shares – number of shared filesystems managed by the Shared Filesystem service (OpenStack Manila)
osdpl_masakari_hosts – number of hypervisors protected by the Instance HA service (OpenStack Masakari)
osdpl_neutron_availability_zone_info – metadata of availability zones defined in the Networking service (OpenStack Neutron)
osdpl_neutron_zone_routers – number of routers per availability zone in the Networking service (OpenStack Neutron)
osdpl_nova_aggregate_hosts – number of compute nodes per host aggregate
osdpl_nova_audit_orphaned_allocations – number of instance records in the Compute service (OpenStack Nova) DB that might be stuck or orphaned
osdpl_nova_availability_zone_hosts – number of compute nodes per availability zone
osdpl_nova_availability_zone_info – metadata of availability zones defined in the Compute service (OpenStack Nova)
osdpl_nova_availability_zone_instances – number of instances per availability zone
osdpl_version_info – OpenStack version, for example, Antelope
tf_operator_info – OpenSDN/Tungsten Fabric version
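As a final illustration of how such counters are read together, the global Nova capacity and usage metrics yield the remaining compute headroom, the kind of trend that proactive support watches for. A minimal sketch with made-up sample values:

```python
# Made-up sample values standing in for the openstack_nova_all_* metrics.
vcpus_total = 2048      # openstack_nova_all_vcpus_total
vcpus_used = 1536       # openstack_nova_all_used_vcpus_total
ram_total_gb = 8192     # openstack_nova_all_ram_total_gb
ram_used_gb = 6758      # openstack_nova_all_used_ram_total_gb

# Headroom as a fraction of capacity; values trending toward zero signal
# approaching resource exhaustion.
vcpu_headroom = 1 - vcpus_used / vcpus_total    # 0.25
ram_headroom = 1 - ram_used_gb / ram_total_gb   # ~0.175
print(f"vCPU headroom: {vcpu_headroom:.1%}, RAM headroom: {ram_headroom:.1%}")
```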