Configure monitoring of cloud workload availability¶
MOSK enables cloud operators to oversee the availability of workloads hosted in their OpenStack infrastructure by monitoring floating IP address availability (Cloudprober) and network port availability (Portprober).
For the feature description and usage requirements, refer to Workload monitoring.
Configure floating IP address availability monitoring¶
Available since MOSK 23.2 TechPreview
MOSK allows you to monitor the floating IP address availability through the Cloudprober service. This section explains the details of the service configuration.
Enable the Cloudprober service¶
Enable the Cloudprober service in the OpenStackDeployment custom resource:
spec:
  features:
    services:
      - cloudprober
Wait until the OpenStackDeployment state becomes APPLIED:
kubectl -n openstack get osdplst
Example of a positive system response:
NAME      OPENSTACK VERSION   CONTROLLER VERSION   STATE
osh-dev   yoga                0.13.1.dev54         APPLIED
Verify that the Cloudprober service is running:
kubectl -n openstack get pods -l application=cloudprober
Example of a positive system response:
NAME                                     READY   STATUS    RESTARTS   AGE
openstack-cloudprober-587b4bf7c4-lwmxx   2/2     Running   2          3d1h
openstack-cloudprober-587b4bf7c4-v9tt9   2/2     Running   0          3d1h
Verify that the Cloudprober service is sending data to StackLight:
Log in to the StackLight Prometheus web UI.
Navigate to Status > Targets.
Search for the openstack-cloudprober target and verify that it is UP.
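The two command-line checks above can be scripted. The following is a minimal sketch, not part of MOSK: the function names are hypothetical, and the parsing assumes the column layout shown in the example responses above (STATE as the last column of the osdplst output; READY and STATUS as the second and third columns of the pod list).

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with MOSK.

# Succeeds only if stdin (the output of `kubectl -n openstack get osdplst`)
# contains at least one data row and every data row ends with APPLIED.
osdplst_all_applied() {
    awk 'NR > 1 && NF { rows++; if ($NF != "APPLIED") bad++ }
         END { exit (rows > 0 && bad == 0) ? 0 : 1 }'
}

# Prints the names of pods that are not fully ready, reading the output of
# `kubectl -n openstack get pods -l application=cloudprober` from stdin.
# A pod is reported if its READY counts differ or its STATUS is not Running.
pods_not_ready() {
    awk 'NR > 1 && NF {
             split($2, ready, "/")
             if (ready[1] != ready[2] || $3 != "Running") print $1
         }'
}
```

For example, piping `kubectl -n openstack get osdplst` into `osdplst_all_applied` inside a retry loop gives a simple wait condition, and an empty result from `pods_not_ready` indicates that all listed Cloudprober pods are running.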
Configure security groups¶
By default, the IP address of the Cloudprober Pod is translated to the node IP address for outgoing traffic. This procedure assumes no further translation of that node IP address on the path between the node and the floating network.
Identify the node IP address used for traffic destined for the floating network: select any IP address from the floating network and run the following command on each OpenStack control plane node:
ip r get <floating ip> | grep -E -o '(src .*)' | awk '{print $2}'
In the project where monitored virtual machines are running, create a security group:
openstack security group create --project <project_id> instance-monitoring
Create a rule for each node IP address that you identified in the first step:
openstack security group rule create --proto icmp --ingress --remote-ip <node ip> instance-monitoring
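With several control plane nodes, the per-node lookup and rule creation are tedious to repeat by hand. The sketch below is one hypothetical way to script them; the function names are illustrative only. `route_src_ip` performs the same extraction as the grep and awk pipeline above, and `rule_commands` only prints the `openstack` commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with MOSK.

# Prints the source address found in `ip r get <floating ip>` output on stdin.
route_src_ip() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Reads node IP addresses from stdin, one per line, and prints one
# rule-creation command per unique address (dry run; nothing is executed).
rule_commands() {
    sort -u | while read -r ip; do
        [ -n "$ip" ] || continue
        printf 'openstack security group rule create --proto icmp --ingress --remote-ip %s instance-monitoring\n' "$ip"
    done
}
```

Collect the output of `ip r get <floating ip> | route_src_ip` from each control plane node into a file, then pass that file to `rule_commands` and review the printed commands before running them.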
Mark instances with floating IPs for monitoring¶
Log in to the keystone-client Pod and assign the openstack.lcm.mirantis.com:prober tag to each instance to be added to monitoring:
openstack --os-compute-api-version 2.26 server set --tag openstack.lcm.mirantis.com:prober <INSTANCE_ID>
Assign the instance-monitoring security group to the server:
openstack server add security group <SERVER_ID> <SECURITY_GROUP_ID>
Verify that the instances have been added successfully.
Cloudprober auto-discovers instances on a periodic basis. Therefore, wait for the discovery interval to pass (600 seconds by default) and run the following command inside the keystone-client Pod:
curl -s http://cloudprober.openstack.svc.cluster.local:9313/metrics | grep <INSTANCE_ID>
Example of a positive system response:
cloudprober_total{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266388 1685963215202
cloudprober_success{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 266386 1685963215202
cloudprober_latency{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7"} 315484742.137 1685963215202
cloudprober_validation_failure{ptype="ping",probe="openstack-instances-icmp-probe",dst="d34a0c6b-91a2-4bd3-95ea-772da49b90c3-10.11.12.122",openstack_hypervisor_hostname="mk-ps-xp4m27lfl56j-1-w74pc7cinu67-server-42kx24m22xop.cluster.local",openstack_instance_id="d34a0c6b-91a2-4bd3-95ea-772da49b90c3",openstack_instance_name="test-vm-proj",openstack_project_id="1eb031db8add42fda2fdb0ef2c2ad8d7",validator="data-integrity"} 0 1685963215202
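The cloudprober_total and cloudprober_success counters in a response like the one above can be combined into a quick success ratio for one instance. This is an illustrative sketch with a hypothetical function name; it assumes that each sample line ends with a value followed by a timestamp, as in the example, and that label values contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with MOSK.

# Reads Cloudprober metrics text from stdin and prints success/total for the
# instance ID given as $1. Returns non-zero if no total counter is found.
probe_success_ratio() {
    awk -v id="$1" '
        # Skip samples that belong to other instances.
        index($0, "openstack_instance_id=\"" id "\"") == 0 { next }
        # The value is the second-to-last field; the last one is the timestamp.
        index($0, "cloudprober_total{") == 1   { total = $(NF - 1) }
        index($0, "cloudprober_success{") == 1 { success = $(NF - 1) }
        END {
            if (total > 0) printf "%.4f\n", success / total
            else exit 1
        }
    '
}
```

Wherever awk is available, the function can consume the curl output directly: `curl -s http://cloudprober.openstack.svc.cluster.local:9313/metrics | probe_success_ratio <INSTANCE_ID>`. A non-zero exit status suggests that the instance has not been discovered yet.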
Note
You can adjust the instance auto-discovery interval in the OpenStackDeployment object. However, Mirantis does not recommend setting it too low, to avoid high load on the OpenStack API:
spec:
  features:
    cloudprober:
      discovery:
        interval: 300
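If you prefer to change the interval from the command line, one option is a JSON merge patch. The sketch below only builds the patch document for the field path shown above; the function name is hypothetical, and the `kubectl patch` invocation in the comment assumes that `osdpl` resolves to the OpenStackDeployment resource and that `<osdpl_name>` is the name of your object.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with MOSK.

# Prints a JSON merge patch that sets
# spec.features.cloudprober.discovery.interval to the value given as $1.
# Rejects anything that is not a positive integer without leading zeros.
discovery_interval_patch() {
    case "$1" in
        ''|*[!0-9]*|0*)
            echo "interval must be a positive integer" >&2
            return 1
            ;;
    esac
    printf '{"spec":{"features":{"cloudprober":{"discovery":{"interval":%s}}}}}\n' "$1"
}

# Possible usage (an assumption, not taken from the MOSK documentation):
#   kubectl -n openstack patch osdpl <osdpl_name> --type merge \
#       -p "$(discovery_interval_patch 300)"
```

Validating the value before building the patch keeps a typo from reaching the custom resource.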
Now, you can view the availability of instance floating IP addresses per OpenStack compute node and project, as well as the probe statistics for individual instance floating IP addresses, through the OpenStack Instances Availability dashboard in Grafana.
Enable network port availability monitoring¶
Available since MOSK 24.2 TechPreview
MOSK allows you to monitor the network port availability through the Portprober service.
The Portprober service is enabled by default when the Cloudprober service is enabled as described above, on clouds that run OpenStack Antelope or newer and use the Neutron OVS backend for networking.
Also, you can enable Portprober explicitly, regardless of whether Cloudprober is enabled or not. See Network port availability monitoring (Portprober) for details.
When the service is enabled, you can monitor the network port availability through the OpenStack PortProber dashboard in Grafana.