Enable the Ceph Prometheus plugin

If you have deployed StackLight LMA, you can enhance Ceph monitoring by enabling the Ceph Prometheus plugin, which is based on the native Prometheus exporter introduced in Ceph Luminous. With the plugin enabled, the Ceph Prometheus plugin collects Ceph metrics instead of Telegraf and provides a wider set of graphs in the Grafana web UI: an overview of the Ceph cluster, hosts, OSDs, pools, and RADOS Gateway nodes, as well as detailed graphs for the Ceph OSD and RADOS Gateway nodes. You can enable the Ceph Prometheus plugin manually on an existing MCP cluster as described below, or during the upgrade of StackLight LMA as described in Upgrade StackLight LMA using the Jenkins job.

To enable the Ceph Prometheus plugin manually:

  1. Update the Ceph formula package.

  2. Open your project Git repository with the Reclass model on the cluster level.

  3. In classes/cluster/cluster_name/ceph/mon.yml, remove the service.ceph.monitoring.cluster_stats class.

  4. In classes/cluster/cluster_name/ceph/osd.yml, remove the service.ceph.monitoring.node_stats class.

  5. Log in to the Salt Master node.

  6. Refresh grains to set the new alerts and graphs:

    salt '*' state.sls salt.minion.grains
    
  7. Enable the Prometheus plugin:

    salt -C 'I@ceph:mon' state.sls ceph.mgr
    
  8. Update the targets and alerts in Prometheus:

    salt -C 'I@docker:swarm and I@prometheus:server' state.sls prometheus
    
  9. Update the Grafana dashboards:

    salt -C 'I@grafana:client' state.sls grafana
    
  10. (Optional) Enable the StackLight LMA prediction alerts for Ceph.

    Note

    This feature is available as a technical preview. Use this configuration for testing and evaluation purposes only.

    Warning

    This feature is available starting from the MCP 2019.2.3 maintenance update. Before enabling the feature, follow the steps described in Apply maintenance updates.

    1. Open your project Git repository with the Reclass model on the cluster level.

    2. In classes/cluster/cluster_name/ceph/common.yml, set enable_prediction to True:

      parameters:
        ceph:
          common:
            enable_prediction: True
      
    3. Log in to the Salt Master node.

    4. Refresh grains to set the new alerts and graphs:

      salt '*' state.sls salt.minion.grains
      
    5. Verify and update the alert thresholds based on the cluster hardware.

      Note

      For details about tuning the thresholds, contact Mirantis support.

    6. Update the targets and alerts in Prometheus:

      salt -C 'I@docker:swarm and I@prometheus:server' state.sls prometheus
      
  11. Customize Ceph prediction alerts as described in Ceph.
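
Before you apply the Salt states in the procedure above, it may help to confirm that the Reclass model no longer references the Telegraf-based Ceph monitoring classes. The following is a minimal sketch of such a check, not part of the official procedure. The function names has_class and check_ceph_model are hypothetical, and the directory argument assumes that mon.yml and osd.yml reside in the same ceph directory of the cluster model, as in steps 3 and 4.

```shell
#!/bin/sh
# Sketch only: pre-check of the Reclass model edits from steps 3 and 4.
# has_class and check_ceph_model are illustrative names, not MCP tooling.

# Succeeds (exit 0) if FILE contains the literal string CLASS.
# This is a plain substring match, so commented-out lines also count.
# A missing file is treated as "class not present".
has_class() {
  file=$1
  class=$2
  grep -Fqs -- "$class" "$file"
}

# Prints one line per leftover class and returns 1 if any is found.
check_ceph_model() {
  ceph_dir=$1   # for example, classes/cluster/cluster_name/ceph
  status=0
  if has_class "$ceph_dir/mon.yml" "service.ceph.monitoring.cluster_stats"; then
    echo "mon.yml still includes service.ceph.monitoring.cluster_stats"
    status=1
  fi
  if has_class "$ceph_dir/osd.yml" "service.ceph.monitoring.node_stats"; then
    echo "osd.yml still includes service.ceph.monitoring.node_stats"
    status=1
  fi
  return $status
}
```

For example, after sourcing the functions, you could run check_ceph_model classes/cluster/cluster_name/ceph from the root of the model repository and continue with the Salt commands only if it returns 0.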