In the MCP 2019.2.4 maintenance update, Mirantis introduces the following enhancements for StackLight LMA:
To obtain the enhancements, follow the steps described in Apply maintenance updates.
Updated Elasticsearch and Kibana from version 5.6.12 to 6.8.0.
Added support for Prometheus Elasticsearch exporter that periodically sends configured queries to the Elasticsearch cluster and exposes the results as Prometheus metrics that you can view in the Prometheus web UI.
Added the capability to encrypt the communication between Prometheus and Telegraf, as well as between Fluentd and Elasticsearch, inside an MCP deployment using the Transport Layer Security (TLS) protocol.
Warning
The functionality does not cover encryption of the traffic between HAProxy and Elasticsearch.
Implemented the openstack_nova_instance_status and libvirt_domain_info_state metrics to provide an overview of a VM status from the OpenStack perspective and its state from the libvirt perspective. To view the metrics, use the Prometheus web UI.
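Besides the web UI, the same metrics can be read through the Prometheus HTTP API instant-query endpoint. The following is a minimal sketch that only builds the query URL. The PROMETHEUS_URL address is a placeholder, not a value defined by this update, and the prom_query_url helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Placeholder address (assumption): replace with the Prometheus endpoint of your deployment.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://prometheus.example.local:9090}"

# Build the instant-query URL for a plain metric name.
# Metric names such as openstack_nova_instance_status contain no characters
# that need URL encoding; more complex PromQL expressions would.
prom_query_url() {
  printf '%s/api/v1/query?query=%s\n' "$PROMETHEUS_URL" "$1"
}

# Usage (requires network access to Prometheus):
#   curl -s "$(prom_query_url openstack_nova_instance_status)"
#   curl -s "$(prom_query_url libvirt_domain_info_state)"
```

The response is a JSON document, so it can be piped to any JSON-aware tool for filtering by label.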
Added the capability for Fluentd to parse the Docker logs and send them to Elasticsearch. You can now view the Docker service logs in the Kibana web UI.
Implemented the KPI Downtime and KPI Provisioning Grafana dashboards as well as the OVSInstanceArpingCheckDown and OpencontrailInstancePingCheckDownKey alerts to provide an overview of the infrastructure stability based on the following Key Performance Indicator (KPI) measurements:
Provisioning KPI. Based on processing the compute.instance.create.start, compute.instance.create.end, and compute.instance.create.error Nova notifications and calculating the KPI on a daily basis. The measurements reset at midnight.
Downtime KPI. Provides the percentage of downtime check failures. Depending on the MCP cluster configuration, the downtime KPI includes the following measurements:
ERROR.
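To illustrate what a failure percentage of this kind means, the sketch below computes failed checks as a share of all checks. This is an assumption about the arithmetic only: the exact formula and inputs used by the StackLight dashboards are not stated here, and the kpi_failure_percent helper name is hypothetical.

```shell
#!/usr/bin/env bash
# kpi_failure_percent FAILED TOTAL
# Prints FAILED as a percentage of TOTAL with two decimals.
# Assumption: an empty measurement window (TOTAL = 0) is reported as 0.00.
kpi_failure_percent() {
  awk -v failed="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "0.00"; exit }
    printf "%.2f\n", failed * 100 / total
  }'
}

# Usage: 3 failed checks out of 120 since midnight
#   kpi_failure_percent 3 120
```

Because the measurements reset at midnight, both counters in such a calculation would cover the current day only.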
Enhanced the StackLight LMA alerts to optimize infrastructure monitoring.
Reconsidered the severities of the RabbitMQ* alerts and adjusted the Alertmanager* and SystemMemory* alerts.
Restructured and enhanced the alerts documentation to describe how to customize alerts, provide troubleshooting recommendations, and list the alerts that require post-deployment tuning according to the deployment configuration.
Added the following alerts:
Removed the inefficient ContrailFlows*, NovaHypervisor*, NovaAggregate*, NovaTotalVCPUs*, NovaTotalMemory*, NovaTotalDisk*, MemcachedServiceRespawn, MemcachedItemsNoneMinor, SystemSwap*, PrometheusTargetSamples*, and PrometheusDataIngestionWarning alerts.
Note
These alerts will be removed automatically when updating to MCP 2019.2.4. However, if you have modified any of these alerts, you must remove them manually as described in MCP Operations Guide: Manage alerts.
Added the capability for Fluentd to handle the OpenStack Cloud Auditing Data Federation (CADF) notifications instead of Heka. Deprecated the Heka service.
If required, you can configure Fluentd running on the RabbitMQ nodes to forward the CADF events to external security information and event management (SIEM) systems. For details, see MCP Operations Guide: Enable sending CADF events to external SIEM systems.
To enable CADF notifications handling by Fluentd and remove Heka:
On the cluster level of the Reclass model:
In openstack/message_queue.yml, add the following class:
- system.fluentd.label.notifications
In stacklight/client.yml, remove the following class:
- system.docker.swarm.stack.monitoring.remote_collector
In stacklight/server.yml, remove the Heka classes:
- system.heka.remote_collector.container
- system.heka.remote_collector.input.amqp
- system.heka.remote_collector.output.elasticsearch
- system.heka.remote_collector.output.telegraf
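Before moving on to the Salt Master steps, it can help to confirm that no Heka-related classes remain in the edited files. The following is a minimal sketch using grep. The list_stale_classes helper name is hypothetical, and the file paths in the usage comment assume that you run it from the cluster-level directory of your Reclass model.

```shell
#!/usr/bin/env bash
# Print, with line numbers, any Heka or remote_collector classes still
# referenced in a Reclass file. Prints nothing when the file is clean.
list_stale_classes() {
  grep -nE 'system\.heka\.|system\.docker\.swarm\.stack\.monitoring\.remote_collector' "$1" || true
}

# Usage (paths relative to the cluster level of the Reclass model; assumption):
#   list_stale_classes stacklight/client.yml
#   list_stale_classes stacklight/server.yml
```

Empty output for both files indicates that the classes listed above were removed.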
From the Salt Master node:
Update the Fluentd configuration:
salt -C "I@fluentd:agent" state.sls fluentd
Apply the changes:
salt -C "I@docker:swarm:role:master and I@prometheus:server" state.sls docker.client
Remove the Docker service with Heka:
salt -C "I@docker:swarm:role:master and I@prometheus:server" cmd.run 'docker service rm monitoring_remote_collector'
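After the last step, you may want to confirm that the Heka service is gone. The sketch below is one possible check, not part of the official procedure: it reads the output of docker service ls from standard input and succeeds only if monitoring_remote_collector is absent. The heka_service_removed helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when the Heka service name does not appear
# as a whole word anywhere in the text read from standard input.
heka_service_removed() {
  ! grep -qwF 'monitoring_remote_collector'
}

# Usage from the Salt Master node:
#   salt -C "I@docker:swarm:role:master and I@prometheus:server" \
#     cmd.run 'docker service ls' | heka_service_removed \
#     && echo "Heka service removed"
```

Matching on a whole word keeps the check independent of the column layout and of the indentation that Salt adds to command output.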