Verify StackLight after configuration

This section describes how to verify StackLight after configuring its parameters as described in Configure StackLight and StackLight configuration parameters. For each StackLight key that you modified, perform the corresponding verification procedure.

To verify StackLight after configuration:

Key

Verification procedure

alerta.enabled

Verify that Alerta is present in the list of StackLight resources. An empty output indicates that Alerta is disabled.

kubectl get all -n stacklight -l app=alerta
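The "empty output means disabled" rule can be scripted. The following is a minimal sketch, not part of StackLight: the helper name `component_state` is hypothetical, and it only checks whether its standard input contains any non-blank line.

```shell
# Hypothetical helper: print "enabled" if stdin contains at least one
# non-blank line, otherwise print "disabled".
component_state() {
  if grep -q '[^[:space:]]'; then
    echo enabled
  else
    echo disabled
  fi
}

# Usage (requires access to the cluster):
#   kubectl get all -n stacklight -l app=alerta -o name 2>/dev/null | component_state
```

The same helper can be fed the output of the other label-based checks in this section, such as the logging stack and metric collector checks.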

elasticsearch.logstashRetentionTime

Verify that the unit_count parameter contains the desired number of days:

kubectl get cm elasticsearch-curator-config -n \
stacklight -o=jsonpath='{.data.action_file\.yml}'
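To avoid reading the whole action file, the unit_count lines can be filtered out of the command output. This is a sketch under the assumption that the action file is plain YAML with `unit_count: <number>` entries; the helper name `curator_unit_counts` is hypothetical.

```shell
# Hypothetical helper: print every unit_count value found in a
# Curator action file read from stdin.
curator_unit_counts() {
  awk '$1 == "unit_count:" { print $2 }'
}

# Usage (requires access to the cluster):
#   kubectl get cm elasticsearch-curator-config -n stacklight \
#     -o=jsonpath='{.data.action_file\.yml}' | curator_unit_counts
```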

grafana.renderer.enabled

Verify the Grafana Image Renderer logs. If the parameter is set to true, the output should include the line HTTP Server started, listening at http://localhost:8081.

kubectl logs -f -n stacklight -l app=grafana --container grafana-renderer

grafana.homeDashboard

In the Grafana web UI, verify that the desired dashboard is set as the home dashboard.

logging.enabled

Verify that Elasticsearch, Fluentd, and Kibana are present in the list of StackLight resources. An empty output indicates that the StackLight logging stack is disabled.

kubectl get all -n stacklight -l 'app in (elasticsearch-master,kibana,fluentd-elasticsearch)'

highAvailabilityEnabled

Run kubectl get sts -n stacklight. The output shows the number of replicas of each service, which corresponds to the HA or non-HA StackLight mode. For details, see StackLight deployment architecture.
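As a complementary check, desired and ready replica counts can be compared side by side. The sketch below is an illustration only: the helper name `sts_not_ready` is hypothetical, and the expected replica counts for each mode still come from StackLight deployment architecture, not from this script.

```shell
# Hypothetical helper: read "NAME DESIRED READY" rows from stdin and print
# the StatefulSets whose ready replica count differs from the desired one.
# Returns a non-zero status if at least one mismatch is found.
sts_not_ready() {
  awk '$2 != $3 { print $1 ": " $3 "/" $2 " ready"; bad = 1 }
       END { exit bad + 0 }'
}

# Usage (requires access to the cluster):
#   kubectl get sts -n stacklight --no-headers \
#     -o custom-columns=NAME:.metadata.name,DESIRED:.spec.replicas,READY:.status.readyReplicas \
#     | sts_not_ready
```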

metricCollector.enabled

Verify that the metric collector is present in the list of StackLight resources. An empty output indicates that the metric collector is disabled.

kubectl get all -n stacklight -l app=metric-collector
  • prometheusServer.retentionTime

  • prometheusServer.retentionSize

  • prometheusServer.alertResendDelay

  1. In the Prometheus web UI, navigate to Status > Command-Line Flags.

  2. Verify the values for the following flags:

    • storage.tsdb.retention.time

    • storage.tsdb.retention.size

    • rules.alert.resend-delay
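The same flags are also exposed by the Prometheus HTTP API at /api/v1/status/flags. The following sketch extracts one flag value from that response. It assumes the response is compact JSON with string values, which is why plain grep is enough here; a JSON-aware tool would be more robust. The helper name `prom_flag_value` and the `PROM_URL` variable are hypothetical.

```shell
# Hypothetical helper: print the value of one Prometheus command-line flag
# from the /api/v1/status/flags JSON response read from stdin.
# Assumes compact JSON of the form {"data":{"<flag>":"<value>",...}}.
prom_flag_value() {
  grep -o "\"$1\":\"[^\"]*\"" | cut -d'"' -f4
}

# Usage (PROM_URL is a placeholder for an address where the Prometheus
# web UI is reachable from your workstation):
#   curl -s "$PROM_URL/api/v1/status/flags" | prom_flag_value storage.tsdb.retention.time
#   curl -s "$PROM_URL/api/v1/status/flags" | prom_flag_value rules.alert.resend-delay
```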

  • clusterSize

  • resourcesPerClusterSize

  • resources

  1. Obtain the list of pods:

    kubectl get po -n stacklight
    
  2. Verify that the desired resource limits or requests are set in the resources section of every container in the pod:

    kubectl get po <pod_name> -n stacklight -o yaml
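Instead of scanning the full YAML, the resources section of each container can be printed on one line with a JSONPath template. The sketch below flags containers whose resources section is empty. The helper name `containers_without_resources` is hypothetical, and the empty-value spellings it checks for (`{}` and `map[]`) are assumptions about how different kubectl versions print an empty object.

```shell
# Hypothetical helper: read "container<TAB>resources" rows from stdin and
# print the names of containers whose resources section is empty.
containers_without_resources() {
  awk -F '\t' '$2 == "" || $2 == "{}" || $2 == "map[]" { print $1 }'
}

# Usage (requires access to the cluster; replace <pod_name>):
#   kubectl get po <pod_name> -n stacklight \
#     -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}' \
#     | containers_without_resources
```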
    
  • nodeSelector.default

  • nodeSelector.component

  • tolerations.default

  • tolerations.component

Verify that the pods of the appropriate components are located on the intended nodes:

kubectl get pod -o=custom-columns=NAME:.metadata.name,\
STATUS:.status.phase,NODE:.spec.nodeName -n stacklight
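For larger clusters, the placement check can be automated by filtering the command output against a list of allowed nodes. This is a minimal sketch; the helper name `pods_on_unexpected_nodes` is hypothetical, and the node names passed to it are whatever nodes you intend StackLight to run on.

```shell
# Hypothetical helper: read "NAME STATUS NODE" rows from stdin and print the
# pods scheduled on a node that is not in the list passed as arguments.
pods_on_unexpected_nodes() {
  awk -v allowed="$*" '
    BEGIN { n = split(allowed, a, " "); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
    !($3 in ok) { print $1 " -> " $3 }'
}

# Usage (requires access to the cluster; --no-headers drops the header row):
#   kubectl get pod --no-headers -o=custom-columns=NAME:.metadata.name,\
#   STATUS:.status.phase,NODE:.spec.nodeName -n stacklight \
#     | pods_on_unexpected_nodes <node_1> <node_2>
```

Pods that are not scheduled yet appear with a `<none>` node and are therefore reported as well.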
  • storage.defaultStorageClass

  • storage.componentStorageClasses

Verify that the PVCs of the appropriate components have been created according to the configured StorageClass:

kubectl get pvc -n stacklight
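The StorageClass comparison can also be scripted. The sketch below lists the PVCs that use a StorageClass other than the expected one. The helper name `pvcs_with_other_class` is hypothetical, and it assumes a single expected class; if storage.componentStorageClasses assigns different classes per component, run it per component or inspect the listing manually.

```shell
# Hypothetical helper: read "NAME CLASS STATUS" rows from stdin and print the
# PVCs whose StorageClass differs from the one passed as the first argument.
pvcs_with_other_class() {
  awk -v want="$1" '$2 != want { print $1 " uses " $2 }'
}

# Usage (requires access to the cluster; replace <storage_class_name>):
#   kubectl get pvc -n stacklight --no-headers \
#     -o custom-columns=NAME:.metadata.name,CLASS:.spec.storageClassName,STATUS:.status.phase \
#     | pvcs_with_other_class <storage_class_name>
```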
  • sfReporter.enabled

  • sfReporter.salesForce

  • sfReporter.cronjob

  1. Verify that Salesforce reporter is enabled. The SUSPEND field in the output must be False.

    kubectl get cronjob -n stacklight
    
  2. Verify that the Salesforce reporter configuration includes all expected queries:

    kubectl get configmap -n stacklight \
    sf-reporter-config -o yaml
    
  3. After the cron job runs (by default, at midnight server time), obtain the Salesforce reporter pod name. The output should include the Salesforce reporter pod, and its STATUS must be Completed.

    kubectl get pods -n stacklight
    
  4. Verify that Salesforce reporter successfully authenticates to Salesforce and creates records. The output must include the Salesforce authentication successful line, as well as the Created record or the Duplicate record and Updated record lines.

    kubectl logs -n stacklight <sf-reporter-pod-name>
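The log check in the last step can be sketched as a small filter. The helper name `sf_reporter_log_ok` is hypothetical. It assumes the messages appear in the log verbatim as quoted above and, as a simplification, accepts any one of the three record lines as evidence that a record was processed.

```shell
# Hypothetical helper: succeed if the log on stdin contains the
# authentication line and at least one record line.
sf_reporter_log_ok() {
  log=$(cat)
  printf '%s\n' "$log" | grep -q 'Salesforce authentication successful' || return 1
  printf '%s\n' "$log" | grep -q -e 'Created record' -e 'Duplicate record' -e 'Updated record'
}

# Usage (requires access to the cluster; replace <sf-reporter-pod-name>):
#   kubectl logs -n stacklight <sf-reporter-pod-name> | sf_reporter_log_ok \
#     && echo "reporter OK" || echo "reporter check failed"
```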
    

ceph.enabled

  1. In the Grafana web UI, verify that Ceph dashboards are present in the list of dashboards and are populated with data.

  2. In the Prometheus web UI, click Alerts and verify that the list of alerts contains Ceph* alerts.

  • externalEndpointMonitoring.enabled

  • externalEndpointMonitoring.domains

  1. In the Prometheus web UI, navigate to Status > Targets.

  2. Verify that the blackbox-external-endpoint target contains the configured domains (URLs).

ironic.endpoint

In the Grafana web UI, verify that the Ironic BM dashboard displays meaningful data (no false-positive or empty panels).

metricFilter

  1. In the Prometheus web UI, navigate to Status > Configuration.

  2. Verify that the following fields in the metric_relabel_configs section for the kubernetes-nodes-cadvisor and prometheus-kube-state-metrics scrape jobs have the required configuration:

    • action is set to keep or drop

    • regex contains a regular expression with configured namespaces delimited by |

    • source_labels is set to [namespace]

  • sslCertificateMonitoring.enabled

  • sslCertificateMonitoring.domains

  1. In the Prometheus web UI, navigate to Status > Targets.

  2. Verify that the blackbox target contains the configured domains (URLs).

mke.enabled

  1. In the Grafana web UI, verify that the MKE Cluster and MKE Containers dashboards are present and not empty.

  2. In the Prometheus web UI, navigate to Alerts and verify that the MKE* alerts are present in the list of alerts.

mke.dockerdDataRoot

In the Prometheus web UI, navigate to Alerts and verify that the MKEAPIDown alert is not firing as a false positive because of a missing certificate.

prometheusServer.customAlerts

In the Prometheus web UI, navigate to Alerts and verify that the list of alerts has changed according to your customization.

prometheusServer.watchDogAlertEnabled

In the Prometheus web UI, navigate to Alerts and verify that the list of alerts contains the Watchdog alert.

alertmanagerSimpleConfig.genericReceivers

In the Alertmanager web UI, navigate to Status and verify that the Config section contains the intended receiver(s).

alertmanagerSimpleConfig.genericRoutes

In the Alertmanager web UI, navigate to Status and verify that the Config section contains the intended route(s).

  • alertmanagerSimpleConfig.email.enabled

  • alertmanagerSimpleConfig.email

  • alertmanagerSimpleConfig.email.route

In the Alertmanager web UI, navigate to Status and verify that the Config section contains the Email receiver and route.

  • alertmanagerSimpleConfig.salesForce.enabled

  • alertmanagerSimpleConfig.salesForce.auth

  • alertmanagerSimpleConfig.salesForce.route

  1. Verify that sf-notifier is enabled. The output must include the sf-notifier pod name, 1/1 in the READY field, and Running in the STATUS field.

    kubectl get pods -n stacklight
    
  2. Verify that sf-notifier successfully authenticates to Salesforce. The output must include the Salesforce authentication successful line.

    kubectl logs -f -n stacklight <sf-notifier-pod-name>
    
  3. In the Alertmanager web UI, navigate to Status and verify that the Config section contains the HTTP-salesforce receiver and route.

  • alertmanagerSimpleConfig.slack.enabled

  • alertmanagerSimpleConfig.slack.api_url

  • alertmanagerSimpleConfig.slack.channel

  • alertmanagerSimpleConfig.slack.route

In the Alertmanager web UI, navigate to Status and verify that the Config section contains the HTTP-slack receiver and route.