Using KOF (without Grafana)#

Optional Grafana, Mirantis Grafana Dashboards, and grafana-operator#

!!! note "Using Grafana with KOF"
    Effective immediately, Mirantis will no longer distribute Grafana as part of its products or services. This change is being made to proactively avoid potential licensing, redistribution, or compliance considerations related to third-party software. For more information, please contact Mirantis.

If you want to install and enable Grafana and use the Mirantis-provided Grafana dashboards, please see Grafana in KOF.

This document explains how to use KOF without Grafana, using other provided user interfaces.

Metrics and alerts#

Prometheus UI:

Run in the management cluster:

kubectl port-forward -n kof svc/kof-mothership-promxy 8082:8082

Explore the Graph: http://127.0.0.1:8082/graph?g0.expr=up&g0.tab=0
Explore the Alerts: http://127.0.0.1:8082/alerts

CLI queries for automation:

curl 'http://localhost:8082/api/v1/query?query=up' \
  | jq '.data.result | map(.metric.cluster) | unique'

curl 'http://localhost:8082/api/v1/query?query=up' \
  | jq '.data.result | map(.metric.job) | unique'

curl http://localhost:8082/api/v1/query \
  -d 'query=up{cluster="mothership", job="kof-collectors-opencost"}' \
  | jq
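For scripts, the queries above can be wrapped in a small function. The following is a minimal sketch, not part of KOF: `promxy_query` and `PROMXY_URL` are hypothetical names, and the port-forward to `kof-mothership-promxy` shown above is assumed to be running. It uses `curl -G --data-urlencode` so that PromQL expressions containing braces, quotes, or spaces are escaped by curl.

```shell
# Hypothetical helper (not part of KOF): run an instant PromQL query via promxy.
# PROMXY_URL is an assumed variable; it defaults to the port-forward address above.
promxy_query() {
  # -G sends the data as a GET query string; --data-urlencode escapes the expression.
  curl -sS --fail -G "${PROMXY_URL:-http://127.0.0.1:8082}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example (commented out so that sourcing this block has no side effects):
# list the cluster and job of every target that is currently down.
# promxy_query 'up == 0' \
#   | jq -r '.data.result[] | "\(.metric.cluster)\t\(.metric.job)"'
```

Because of `--fail`, curl exits with a non-zero status on HTTP errors, which lets a calling script stop early instead of piping an error page into `jq`.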

Alertmanager UI:

- Run in the management cluster:
```
kubectl port-forward -n kof svc/vmalertmanager-cluster 9093:9093
```
- Open http://127.0.0.1:9093/

VictoriaMetrics UI:

- Run in the regional cluster:
```
KUBECONFIG=regional-kubeconfig kubectl port-forward \
  -n kof svc/vmselect-cluster 8481:8481
```
  To view metrics that the management cluster stores in itself (if any), run this port-forward in the management cluster instead.
- Open http://127.0.0.1:8481/select/0/vmui/#/dashboards
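Each UI above needs a `kubectl port-forward` that keeps running while you use it. In a script, the port-forward can run in the background, and the script can wait until the local port answers before it continues. The sketch below is one way to do that under stated assumptions: `wait_for_http` is a hypothetical helper, not a KOF or kubectl feature, and it treats any HTTP response as proof that the tunnel is up.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Note: POSIX sh has no local variables, so these names are global.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without --fail, curl exits 0 on any HTTP response, even an error status.
    if curl -s -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (commented out so that sourcing this block has no side effects):
# kubectl port-forward -n kof svc/vmalertmanager-cluster 9093:9093 >/dev/null 2>&1 &
# PF_PID=$!
# trap 'kill "$PF_PID" 2>/dev/null' EXIT
# wait_for_http http://127.0.0.1:9093/ && echo "Alertmanager is reachable"
```

The `trap` in the example stops the background port-forward when the script exits, so repeated runs do not leave the local port occupied.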

Logs#

VictoriaLogs UI:

Run in the regional cluster:

KUBECONFIG=regional-kubeconfig kubectl port-forward \
  -n kof svc/kof-storage-victoria-logs-cluster-vlselect 9471:9471

Note that this service listens on port 9471, not on the default VictoriaLogs port 9428.

Open http://127.0.0.1:9471/select/vmui/

CLI query for automation:

curl http://127.0.0.1:9471/select/logsql/query \
  -d 'query=_time:1h' \
  -d 'limit=10'

Run inside the Istio mesh:

curl http://$REGIONAL_CLUSTER_NAME-logs-select:9471/select/logsql/query \
  -d 'query=_time:1h' \
  -d 'limit=10'

Run without Istio or port-forwarding:

VM_USER=$(
  kubectl get secret -n kof storage-vmuser-credentials -o yaml \
  | yq .data.username | base64 -d
)
VM_PASS=$(
  kubectl get secret -n kof storage-vmuser-credentials -o yaml \
  | yq .data.password | base64 -d
)
curl https://vmauth.$REGIONAL_DOMAIN/vls/select/logsql/query \
  -u "$VM_USER":"$VM_PASS" \
  -d 'query=_time:1h' \
  -d 'limit=10'
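The three variants above differ only in the base URL and in whether credentials are sent. A minimal sketch that captures this in two helpers follows. Both `kof_secret_field` and `vlogs_query` are hypothetical names introduced here for illustration. The first reads a Secret field with kubectl's `jsonpath` output instead of `yq`; the second uses `--data-urlencode` so that LogsQL queries containing spaces or quotes are escaped by curl.

```shell
# Hypothetical helper: print one decoded field of a Secret in the kof namespace.
# Uses kubectl's jsonpath output, so yq is not required.
kof_secret_field() {
  kubectl get secret -n kof "$1" -o "jsonpath={.data.$2}" | base64 -d
}

# Hypothetical helper: run a LogsQL query against BASE_URL.
# Usage: vlogs_query BASE_URL QUERY [LIMIT]
# If VM_USER is set, VM_USER and VM_PASS are sent as basic-auth credentials.
vlogs_query() {
  base_url=$1
  query=$2
  limit=${3:-10}
  if [ -n "${VM_USER:-}" ]; then
    curl -sS --fail -u "$VM_USER:${VM_PASS:-}" "$base_url/select/logsql/query" \
      --data-urlencode "query=$query" -d "limit=$limit"
  else
    curl -sS --fail "$base_url/select/logsql/query" \
      --data-urlencode "query=$query" -d "limit=$limit"
  fi
}

# Examples (commented out so that sourcing this block has no side effects):
# vlogs_query http://127.0.0.1:9471 '_time:1h'
# VM_USER=$(kof_secret_field storage-vmuser-credentials username)
# VM_PASS=$(kof_secret_field storage-vmuser-credentials password)
# vlogs_query "https://vmauth.$REGIONAL_DOMAIN/vls" '_time:1h error' 20
```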

Traces#

The Jaeger UI will soon be replaced with the VictoriaTraces UI.

To access the Jaeger UI of a regional cluster, follow these steps:

Ensure you have the regional-kubeconfig file created in the verification step.

If you've applied the Istio section:

- Forward a port to the Jaeger UI:
```
KUBECONFIG=regional-kubeconfig kubectl port-forward \
  -n kof svc/kof-storage-jaeger-query 16686:16686
```
- Open the link http://127.0.0.1:16686/search and explore the Jaeger UI.

If you have not applied the Istio section:

- Ensure you have the REGIONAL_DOMAIN variable set in the installation step.
- Get the regional Jaeger username and password:
```
KUBECONFIG=regional-kubeconfig kubectl get secret \
  -n kof jaeger-credentials -o yaml | yq '{
  "user": .data.username | @base64d,
  "pass": .data.password | @base64d
}'
```
- Get the Jaeger UI URL, open it, and log in with the username and password printed above:
```
echo https://jaeger.$REGIONAL_DOMAIN
```

Cost Management (OpenCost)#

KOF includes OpenCost, which provides cost management features for Kubernetes clusters. Common metrics (also available in the pre-installed Grafana FinOps dashboards if enabled) are:

- `node_total_hourly_cost` (per-node hourly cost)
- Namespace and pod-level cost allocation
- Historical spend trends and efficiency ratios

| Metric | Description |
| --- | --- |
| `node_total_hourly_cost` | Hourly cost per node (includes CPU, memory, storage) |
| `namespace_cpu_cost` | CPU cost aggregated by namespace |
| `namespace_memory_cost` | Memory cost aggregated by namespace |
| `pod_cost` | Cost allocation at pod granularity |
| `cluster_efficiency` | Ratio of requested vs actual resource usage |
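To turn an hourly node cost into a rough monthly figure, multiply it by the number of hours in a month. The sketch below assumes an average month of 730 hours (8760 hours per year divided by 12), which is a common convention and not something KOF defines. `hourly_to_monthly` is a hypothetical helper name.

```shell
# Hypothetical helper: estimate a monthly cost from an hourly cost.
# Assumes an average month of 730 hours (8760 hours per year / 12).
hourly_to_monthly() {
  awk -v hourly="$1" 'BEGIN { printf "%.2f\n", hourly * 730 }'
}

# Example: a node reporting node_total_hourly_cost = 0.5
# hourly_to_monthly 0.5   # prints 365.00
#
# With the promxy port-forward from "Metrics and alerts" running, the summed
# hourly node cost per cluster can be fetched like this (assumes the metric
# carries the "cluster" label used in the queries in that section):
# curl -sS -G 'http://127.0.0.1:8082/api/v1/query' \
#   --data-urlencode 'query=sum by (cluster) (node_total_hourly_cost)'
```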

Once you have this information, you can optimize your cluster. Typical optimizations include:

- Identifying under-utilized resources and right-sizing workloads
- Setting budgets and monitoring spend with alerts

KOF UI#

When the TargetAllocator is in use, the configuration of the OpenTelemetryCollectors' Prometheus receivers is distributed across the cluster.

The KOF UI collects metrics metadata from the same endpoints that are scraped by the Prometheus server:

graph TB
    KOF_UI[KOF UI] --> C1OTC11
    KOF_UI --> C1OTC1N
    KOF_UI --> C1OTC21
    KOF_UI --> C1OTC2N
    KOF_UI --> C2OTC11
    KOF_UI --> C2OTC1N
    KOF_UI --> C2OTC21
    KOF_UI --> C2OTC2N
    subgraph Cluster1
    subgraph C1Node1[Node 1]
        C1OTC11[OTel Collector]
        C1OTC1N[OTel Collector]
    end
    subgraph C1NodeN[Node N]
        C1OTC21[OTel Collector]
        C1OTC2N[OTel Collector]
    end

    C1OTC11 --PrometheusReceiver--> C1TA[TargetAllocator]
    C1OTC1N --PrometheusReceiver--> C1TA
    C1OTC21 --PrometheusReceiver--> C1TA
    C1OTC2N --PrometheusReceiver--> C1TA
    end
    subgraph Cluster2
    subgraph C2Node1[Node 1]
        C2OTC11[OTel Collector]
        C2OTC1N[OTel Collector]
    end
    subgraph C2NodeN[Node N]
        C2OTC21[OTel Collector]
        C2OTC2N[OTel Collector]
    end

    C2OTC11 --PrometheusReceiver--> C2TA[TargetAllocator]
    C2OTC1N --PrometheusReceiver--> C2TA
    C2OTC21 --PrometheusReceiver--> C2TA
    C2OTC2N --PrometheusReceiver--> C2TA
    end

You can access the KOF UI by following these steps:

Forward a port to the KOF UI:

kubectl port-forward -n kof deploy/kof-mothership-kof-operator 9090:9090

Open the link http://127.0.0.1:9090

Check the state of the endpoints:

kof-ui-demo

If there is a misconfiguration in the Prometheus targets (for example, if multiple targets scrape the same URL), the UI will display an error:

kof-ui-prometheus-targets-misconfiguration

The KOF UI also allows you to monitor internal telemetry from the OpenTelemetry collectors, VictoriaMetrics, and VictoriaLogs, enabling comprehensive observability of their health and performance.

kof-ui-collectors-metrics

To identify and debug issues in deployed clusters, check whether the KOF UI shows any errors in these monitored resources:

- ClusterDeployment
- ClusterSummaries
- MultiClusterService
- ServiceSet
- StateManagementProvider
- SveltosCluster

kof-ui-resources-monitoring
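If you want to cross-check what the KOF UI shows from the command line, the same resources can be listed with `kubectl`. This is only a sketch: `kof_list_monitored` is a hypothetical helper, and the plural resource names are an assumption based on the kinds listed above. Run `kubectl api-resources` in the management cluster to confirm the actual names.

```shell
# Hypothetical helper: list the monitored resources across all namespaces.
# The plural names below are assumed from the kinds listed above; verify them
# with "kubectl api-resources" and adjust if your cluster reports other names.
kof_list_monitored() {
  for resource in clusterdeployments clustersummaries multiclusterservices \
    servicesets statemanagementproviders sveltosclusters; do
    echo "== $resource"
    # Keep going if one kind is not installed in this cluster.
    kubectl get "$resource" -A || echo "(could not list $resource)"
  done
}

# Example (commented out so that sourcing this block has no side effects):
# kof_list_monitored
```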