Configure monitoring of cloud workload availability

MOSK enables cloud operators to oversee the availability of workloads hosted in their OpenStack infrastructure by monitoring the availability of floating IP addresses (Cloudprober) and network ports (Portprober).

For the feature description and usage requirements, refer to Workload monitoring.

Configure floating IP address availability monitoring

Available since MOSK 23.2 TechPreview

MOSK allows you to monitor the floating IP address availability through the Cloudprober service. This section explains the details of the service configuration.

Enable the Cloudprober service

  1. Enable the Cloudprober service in the OpenStackDeployment custom resource:

          - cloudprober
  2. Wait until the OpenStackDeployment state becomes APPLIED:

    kubectl -n openstack get osdplst

    Example of a positive system response:

    osh-dev   yoga                0.13.1.dev54         APPLIED
  3. Verify that the Cloudprober service is running:

    kubectl -n openstack get pods -l application=cloudprober

    Example of a positive system response:

    NAME                                     READY   STATUS    RESTARTS   AGE
    openstack-cloudprober-587b4bf7c4-lwmxx   2/2     Running   2          3d1h
    openstack-cloudprober-587b4bf7c4-v9tt9   2/2     Running   0          3d1h
  4. Verify that the Cloudprober service is sending data to StackLight:

    1. Log in to the StackLight Prometheus web UI.

    2. Navigate to Status > Targets.

    3. Search for the openstack-cloudprober target and verify that it is UP.
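
The checks in steps 2 and 3 can also be scripted. The following is a minimal sketch, not part of MOSK: the function names osdpl_is_applied and pods_all_ready are hypothetical, and the parsing assumes that the state is the last column of the osdplst output, that the pod listing uses the READY and STATUS columns shown above, and that a header row, if present, starts with NAME.

```shell
#!/bin/sh
# Hypothetical helpers: both read command output on stdin and only inspect text.

# Succeed only if every data row ends with APPLIED.
# Assumption: the state is the last column of "kubectl -n openstack get osdplst".
osdpl_is_applied() {
  awk '
    $1 == "NAME" || NF == 0 { next }          # skip a header row and blank lines
    { rows++; if ($NF != "APPLIED") bad++ }
    END { exit !(rows > 0 && bad == 0) }      # exit 0 only if all rows matched
  '
}

# Succeed only if every listed pod is Running with all containers ready.
# Assumption: column 2 is READY (for example 2/2) and column 3 is STATUS.
pods_all_ready() {
  awk '
    $1 == "NAME" || NF == 0 { next }          # skip a header row and blank lines
    {
      rows++
      split($2, ready, "/")                   # ready[1] of ready[2] containers
      if (ready[1] != ready[2] || $3 != "Running") bad++
    }
    END { exit !(rows > 0 && bad == 0) }
  '
}
```

To use the helpers, pipe the output of kubectl -n openstack get osdplst into osdpl_is_applied and the output of kubectl -n openstack get pods -l application=cloudprober into pods_all_ready, then check the exit status. Both helpers fail on empty input, so a missing resource is not mistaken for success.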

Configure security groups

By default, for outgoing traffic, the IP address of the Cloudprober Pod is translated to the node IP address. This procedure assumes no further translation of that node IP address on the path between the node and the floating network.

  1. Identify the node IP address used for traffic destined to the floating network. To do so, select any IP address from the floating network and run the following command on each OpenStack control plane node:

    ip r get <floating ip> | grep -E -o '(src .*)' | awk '{print $2}'
  2. In the project where monitored virtual machines are running, create a security group:

    openstack security group create --project <project_id> instance-monitoring
  3. Create a rule for each IP address obtained in step 1:

    openstack security group rule create --proto icmp --ingress --remote-ip <node ip> instance-monitoring
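
When there are several control plane nodes, steps 1 and 3 can be combined in a small script. The sketch below is illustrative only: route_src and print_rule_commands are hypothetical helper names, route_src is an awk-only equivalent of the grep and awk pipeline from step 1, and print_rule_commands only prints the commands from step 3 so that you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helpers: they read text on stdin and do not call ip or openstack.

# Print the source address from one line of "ip r get <floating ip>" output.
route_src() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Read node IP addresses (one per line) and print, without executing them,
# one rule-creation command per unique address.
print_rule_commands() {
  sort -u | while read -r ip; do
    [ -n "$ip" ] || continue   # ignore blank lines
    printf 'openstack security group rule create --proto icmp --ingress --remote-ip %s instance-monitoring\n' "$ip"
  done
}
```

For example, collect the output of ip r get <floating ip> | route_src from each control plane node into a file, feed that file to print_rule_commands, and run the printed commands once they look correct. Duplicate addresses are collapsed, so each node IP address produces a single rule.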

Mark instances with floating IPs for monitoring

  1. Log in to the keystone-client Pod and assign the monitoring tag to each instance that you want to monitor:

    openstack --os-compute-api-version 2.26 server set --tag <INSTANCE_ID>
  2. Assign the instance-monitoring security group to the server:

    openstack server add security group <SERVER_ID> <SECURITY_GROUP_ID>
  3. Verify that the instances have been added successfully.

    Cloudprober discovers instances automatically on a periodic basis. Therefore, wait for the discovery interval to pass (600 seconds by default) and run the following command inside the keystone-client Pod:

    curl -s http://cloudprober.openstack.svc.cluster.local:9313/metrics | grep <INSTANCE_ID>

    Example of a positive system response:

    cloudprober_total{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266388 1685963215202
    cloudprober_success{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266386 1685963215202
    cloudprober_latency{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 315484742.137 1685963215202
    cloudprober_validation_failure{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7",validator="data-integrity"} 0 1685963215202


    You can adjust the instance auto-discovery interval in the OpenStackDeployment object. However, Mirantis does not recommend setting the interval too low, to avoid high load on the OpenStack API:

            interval: 300

You can now view the availability of instance floating IP addresses per OpenStack compute node and project, as well as the probe statistics for individual instance floating IP addresses, through the OpenStack Instances Availability dashboard in Grafana.
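
For a quick check without Grafana, the cloudprober_total and cloudprober_success counters from the metrics output shown earlier can be compared directly. The following is a rough sketch under stated assumptions: probe_failures is a hypothetical helper name, the counter value is assumed to be the second-to-last field on each line (followed by the timestamp, as in the example response), and each instance is assumed to have a single probe. If an instance has several probes, only the last line read for it is kept.

```shell
#!/bin/sh
# Hypothetical helper: read Cloudprober metrics text on stdin and print, per
# instance ID, the total probes, successful probes, and their difference.
probe_failures() {
  awk '
    # Return the value of the openstack_instance_id label, or "" if absent.
    function instance_id(line) {
      if (match(line, /openstack_instance_id="[^"]*"/))
        return substr(line, RSTART + 23, RLENGTH - 24)
      return ""
    }
    # The counter value precedes the trailing timestamp.
    index($0, "cloudprober_total{") == 1   { total[instance_id($0)] = $(NF - 1) }
    index($0, "cloudprober_success{") == 1 { success[instance_id($0)] = $(NF - 1) }
    END {
      for (id in total)
        printf "%s total=%d success=%d failed=%d\n", id, total[id], success[id], total[id] - success[id]
    }
  '
}
```

For example, inside the keystone-client Pod, pipe the output of the curl command from step 3 into probe_failures instead of grep. A failed count that keeps growing between runs suggests that the instance floating IP address is not answering ICMP probes.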

Enable network port availability monitoring

Available since MOSK 24.2 TechPreview

MOSK allows you to monitor the network port availability through the Portprober service.

The Portprober service is enabled by default when the Cloudprober service is enabled as described above, on clouds that run OpenStack Antelope or newer and use the Neutron OVS backend for networking.

You can also enable Portprober explicitly, regardless of whether Cloudprober is enabled. See Network port availability monitoring (Portprober) for details.

When the service is enabled, you can monitor the network port availability through the OpenStack PortProber dashboard in Grafana.