StackLight LMA overview

StackLight LMA monitors nodes, services, and cluster health, and provides rich operational insights out of the box for OpenStack, Kubernetes, and OpenContrail services deployed on the platform. StackLight LMA helps prevent critical conditions in the MCP cluster by sending notifications to cloud operators so that they can act in time to eliminate the risk of service downtime. StackLight LMA uses the following tools to gather monitoring metrics:

  • Telegraf, a plugin-driven server agent that monitors the nodes on which the MCP cluster components are deployed. Telegraf gathers basic operating system metrics, including:

    • CPU

    • Memory

    • Disk

    • Disk I/O

    • System

    • Processes

    • Docker

  • Prometheus, a monitoring toolkit that collects and stores metrics. Each Prometheus instance automatically discovers and monitors a number of endpoints, such as Kubernetes, etcd, Calico, and Telegraf. For Kubernetes deployments, Prometheus discovers the following endpoints:

    • Node, discovers one target per cluster node.

    • Service, discovers a target for each service port.

    • Pod, discovers all pods and exposes their containers as targets.

    • Endpoint, discovers targets from listed endpoints of a service.

    By default, the Prometheus database stores metrics for the past 15 days. To retain the data for a longer period, consider one of the following options:

    • (Default) Prometheus long-term storage, which uses the federated Prometheus to store the metrics (six months)

    • InfluxDB, which uses the remote storage adapter to store the metrics (30 days)


      InfluxDB, including InfluxDB Relay and remote storage adapter, is deprecated in the Q4`18 MCP release and will be removed in the next release.

    Using the Prometheus web UI, you can view simple visualizations, debug the monitoring setup, and add new features such as alerts and aggregates. Grafana dashboards provide a visual representation of the metrics gathered by Telegraf and Prometheus.
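In addition to the web UI, Prometheus exposes an HTTP API, and its `/api/v1/targets` endpoint lists the targets that service discovery has found. The following is a minimal sketch, not part of StackLight LMA itself, of how an operator might summarize discovered targets by job and health. The address in `PROMETHEUS_URL` and the helper names `fetch_targets` and `summarize_targets` are hypothetical and used for illustration only.

```python
import json
from collections import Counter
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical address: replace with the Prometheus endpoint of your deployment.
PROMETHEUS_URL = "http://prometheus.example.local:9090"


def fetch_targets(base_url=PROMETHEUS_URL, timeout=10):
    """Fetch the list of discovered targets from the Prometheus HTTP API."""
    url = urljoin(base_url.rstrip("/") + "/", "api/v1/targets")
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_targets(payload):
    """Count active targets per (job, health) pair in a targets API response."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus returned status %r" % payload.get("status"))
    counts = Counter()
    for target in payload["data"].get("activeTargets", []):
        # Each active target carries its final label set and a health state.
        job = target.get("labels", {}).get("job", "<unknown>")
        health = target.get("health", "unknown")
        counts[(job, health)] += 1
    return counts
```

Calling `summarize_targets(fetch_targets())` against a reachable Prometheus instance returns a counter keyed by job name and health state, which can help confirm that endpoints such as Telegraf or etcd are being discovered and scraped.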
