Tune Prometheus IPMI exporter
In MOSK, IPMI monitoring is provided by the Prometheus IPMI exporter. The exporter is enabled by default on management clusters, collects hardware telemetry from server Baseboard Management Controller (BMC) endpoints, and exposes it as Prometheus metrics. It monitors hosts in both management and MOSK clusters. For architecture details, see Deployment architecture: Prometheus IPMI exporter.
Note
The IPMI exporter monitors only BareMetalHost objects with
IPMI-based BMC addresses, using the ipmi:// scheme or no scheme, which
defaults to IPMI. Hosts configured with other BMC protocols, such as
redfish://, are excluded from IPMI monitoring.
The procedures below describe how to exclude specific clusters or hosts from IPMI monitoring and how to add custom alerts and Grafana dashboards. To configure the IPMI exporter globally, adjust collector settings, or disable the Prometheus IPMI exporter entirely, see Prometheus IPMI exporter.
Disable IPMI monitoring for hosts or clusters
IPMI monitoring can be disabled at two levels: for entire clusters or for individual hosts. When disabled at the cluster level, all hosts belonging to that cluster are excluded from IPMI monitoring. Host‑level exclusions take effect only when cluster‑level exclusion is not enabled.
To disable IPMI monitoring for an entire cluster:
1. Open the Cluster object for editing.
2. In spec.providerSpec.value, set disableIPMIMonitoring: true.
3. Save the Cluster object to apply the change.
This disables IPMI monitoring for all hosts in the cluster.
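As a minimal sketch, the relevant fragment of the Cluster object after this change might look as follows. Only the disableIPMIMonitoring field and its path come from the procedure above; all other fields of your Cluster object are omitted here and must stay unchanged.

```yaml
# Fragment of a Cluster object; unrelated fields are omitted.
spec:
  providerSpec:
    value:
      # Excludes every host in this cluster from IPMI monitoring.
      disableIPMIMonitoring: true
```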
To disable IPMI monitoring for a host:
1. Open the BareMetalHostInventory object for the specific host you want to exclude:
   kubectl -n <project-name> edit baremetalhostinventory <host-name>
2. Add the kaas.mirantis.com/disable-ipmi-monitoring="true" annotation.
3. Save the BareMetalHostInventory object to apply the change.
This disables IPMI monitoring for that host only.
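As a sketch, the annotation would typically sit under metadata.annotations of the BareMetalHostInventory object, as shown below. The annotation key and value come from the procedure above. The placement follows the standard Kubernetes metadata layout, and the rest of the object is omitted.

```yaml
# Fragment of a BareMetalHostInventory object; unrelated fields are omitted.
metadata:
  annotations:
    # The value is quoted so that it is stored as a string, as annotations require.
    kaas.mirantis.com/disable-ipmi-monitoring: "true"
```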
Note
The cluster-level setting takes precedence over host-level annotations. If a cluster has IPMI monitoring disabled, individual host annotations are ignored.
Create custom alerts and Grafana dashboards
IPMI monitoring includes preconfigured Grafana dashboards and Prometheus alert rules. They offer built-in visibility into common hardware health scenarios and can be customized as needed.
You can extend IPMI monitoring by creating custom Prometheus alerts and Grafana dashboards:
To create custom Grafana dashboards, see Create custom dashboards in Grafana.
To configure custom alerts, use prometheusServer.customAlerts in your StackLight configuration. For details, see Alert configuration.
Examples of custom alerts:
Temperature thresholds (warning at 70°C, critical at 80°C)
Fan speed below minimum thresholds
Voltage outside acceptable ranges
Power consumption exceeding capacity
Chassis power state changes
Power supply failures
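To illustrate the temperature example from the list above, the following sketch defines a warning rule and a critical rule under prometheusServer.customAlerts. It rests on several assumptions: the rule fields follow the standard Prometheus alerting rule format, the metric name ipmi_temperature_celsius is the one used by the upstream IPMI exporter and may differ in your deployment, and the alert names and the 5m duration are hypothetical. Check the metric names exposed in your environment before using it.

```yaml
prometheusServer:
  customAlerts:
  # Hypothetical warning rule: any temperature sensor above 70°C for 5 minutes.
  - alert: IpmiTemperatureWarning
    expr: ipmi_temperature_celsius > 70   # assumed metric name
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: IPMI temperature above 70°C
      description: A BMC temperature sensor has reported more than 70°C for 5 minutes.
  # Hypothetical critical rule: any temperature sensor above 80°C for 5 minutes.
  - alert: IpmiTemperatureCritical
    expr: ipmi_temperature_celsius > 80   # assumed metric name
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: IPMI temperature above 80°C
      description: A BMC temperature sensor has reported more than 80°C for 5 minutes.
```

With these expressions, a sensor above 80°C matches both rules. If only the critical alert should fire in that case, one option is to handle it with an inhibition rule in Alertmanager.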