Configure monitoring of instance availability

Available since MOSK 23.2 TechPreview

MOSK provides the OpenStack workload monitoring feature through the Cloudprober exporter. This section explains the monitoring configuration details.

For the feature description and usage requirements, refer to Reference Architecture: Workload monitoring.

Enable OpenStack instances monitoring

  1. Enable the Cloudprober service in the OpenStackDeployment custom resource:

    spec:
      features:
        services:
          - cloudprober
    
  2. Wait until the OpenStackDeployment state becomes APPLIED:

    kubectl -n openstack get osdplst
    

    Example of a positive system response:

    NAME      OPENSTACK VERSION   CONTROLLER VERSION   STATE
    osh-dev   yoga                0.13.1.dev54         APPLIED
    
  3. Verify that the Cloudprober service has been deployed:

    kubectl -n openstack get pods -l application=cloudprober
    

    Example of a positive system response:

    NAME                                     READY   STATUS    RESTARTS   AGE
    openstack-cloudprober-587b4bf7c4-lwmxx   2/2     Running   2          3d1h
    openstack-cloudprober-587b4bf7c4-v9tt9   2/2     Running   0          3d1h
    
  4. Verify that the Cloudprober service is operational:

    1. Log in to the StackLight Prometheus web UI.

    2. Navigate to Status > Targets.

    3. Search for the openstack-cloudprober target and verify that it is UP.
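
Step 2 of the procedure above can be scripted instead of checked by hand. The following is a minimal sketch, not part of the product tooling: all_applied is a hypothetical helper that reads the table printed by kubectl -n openstack get osdplst on standard input and succeeds only if every row reports the APPLIED state. It assumes that STATE is the last column, as in the example response above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every data row of the
# `kubectl -n openstack get osdplst` table ends with the APPLIED state.
# Assumption: STATE is the last column, as in the example response.
all_applied() {
    awk 'NR > 1 { rows++; if ($NF != "APPLIED") bad++ }
         END { exit (rows > 0 && bad == 0) ? 0 : 1 }'
}

# Example use (the 10-second polling interval is an arbitrary choice):
#   until kubectl -n openstack get osdplst | all_applied; do sleep 10; done
```

The helper skips the header row and fails on empty input, so a missing or not yet created status object is not mistaken for success.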

Configure security groups for monitoring

By default, for outgoing traffic, the IP address of the Cloudprober Pod is translated to the node IP address. This procedure assumes that the node IP address is not translated further on the path between the node and the floating network.

  1. Identify the node IP address used for traffic destined to the floating network. To do so, select any IP address from the floating network and run the following command on each OpenStack control plane node:

    ip r get <floating ip> | grep -E -o '(src .*)' | awk '{print $2}'
    
  2. In the project where monitored virtual machines are running, create a security group:

    openstack security group create --project <project_id> instance-monitoring
    
  3. Create a rule for each IP address obtained in step 1:

    openstack security group rule create --proto icmp --ingress --remote-ip <node ip> instance-monitoring
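
Because step 3 repeats for every address gathered in step 1, the procedure can be sketched as a small script. This is an illustration under assumptions, not a supported tool: src_ip and print_rule_cmds are hypothetical helper names, 203.0.113.10 and node-ips.txt are placeholders, and the script only prints the openstack commands so that you can review them before running them.

```shell
#!/bin/sh
# Hypothetical helper: extract the source address from `ip r get` output
# read on stdin, using the same grep/awk pipeline as in step 1.
src_ip() {
    grep -E -o '(src .*)' | awk '{print $2}'
}

# Hypothetical helper: print (not execute) one rule-creation command per
# unique node IP address read on stdin, one address per line.
print_rule_cmds() {
    sort -u | while read -r ip; do
        [ -n "$ip" ] || continue
        echo "openstack security group rule create --proto icmp --ingress --remote-ip $ip instance-monitoring"
    done
}

# Example use on a control plane node (203.0.113.10 stands in for a floating IP):
#   ip r get 203.0.113.10 | src_ip
# After collecting the addresses from all nodes into node-ips.txt (placeholder name):
#   print_rule_cmds < node-ips.txt
```

Deduplicating with sort -u avoids attempting to create the same rule twice when several nodes report the same source address.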
    

Add instances to monitoring

  1. Log in to the keystone-client Pod and assign the openstack.lcm.mirantis.com:prober tag to each instance that you want to monitor:

    openstack --os-compute-api-version 2.26 server set --tag openstack.lcm.mirantis.com:prober <INSTANCE_ID>
    
  2. Assign the instance-monitoring security group to each of these instances:

    openstack server add security group <SERVER_ID> <SECURITY_GROUP_ID>
    
  3. Verify that the instances have been added successfully.

    Cloudprober discovers instances automatically on a periodic basis. Therefore, wait for the discovery interval to pass (600 seconds by default) and run the following command inside the keystone-client Pod:

    curl -s http://cloudprober.openstack.svc.cluster.local:9313/metrics | grep <INSTANCE_ID>
    

    Example of a positive system response:

    cloudprober_total{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266388 1685963215202
    cloudprober_success{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266386 1685963215202
    cloudprober_latency{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 315484742.137 1685963215202
    cloudprober_validation_failure{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7",validator="data-integrity"} 0 1685963215202
    

    Note

    You can adjust the instance auto-discovery interval in the OpenStackDeployment object. However, Mirantis does not recommend setting it too low, because frequent discovery puts a high load on the OpenStack API:

    spec:
      features:
        cloudprober:
          discovery:
            interval: 300
    

Now, you can start monitoring the availability of instance floating IP addresses per OpenStack compute node and project. You can also view the probe statistics for individual instance floating IP addresses through the Openstack Instances Availability dashboard in Grafana.
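
For a quick command-line check alongside the dashboard, the cloudprober_total and cloudprober_success counters from the example response above can be combined into a success ratio. The sketch below is illustrative only: success_ratio is a hypothetical helper name, and it assumes the metric line layout shown in that example (the sample value in the second-to-last field, followed by a timestamp) and input that is already filtered to a single instance.

```shell
#!/bin/sh
# Hypothetical helper: read Cloudprober metric lines for ONE instance on
# stdin and print the share of successful probes as a percentage.
# Assumption: each line ends with "<value> <timestamp>", as in the example.
success_ratio() {
    awk '$1 ~ /^cloudprober_total[{]/   { total   = $(NF-1) }
         $1 ~ /^cloudprober_success[{]/ { success = $(NF-1) }
         END {
             if (total > 0) printf "%.4f%%\n", 100 * success / total
             else           print "no probes recorded"
         }'
}

# Example use inside the keystone-client Pod:
#   curl -s http://cloudprober.openstack.svc.cluster.local:9313/metrics \
#     | grep <INSTANCE_ID> | success_ratio
```

With the counters from the example response (266386 successful probes out of 266388), the helper prints 99.9992%. Reading the value from the second-to-last field keeps the parsing correct even if a label value, such as an instance name, contains spaces.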