Troubleshoot Host Operating System Modules Controller alerts

This section describes the investigation and troubleshooting steps for the Host Operating System Modules Controller alerts.


Day2ManagementControllerTargetDown

Root cause

Prometheus failed to scrape the host-os-modules-controller metrics because the host-os-modules-controller Pod is down or the application returned an error.

Investigation

  1. Verify the status of the host-os-modules-controller Pod:

    kubectl get pod -n kaas \
    -l=app.kubernetes.io/name=host-os-modules-controller
    
  2. Inspect the Kubernetes Pod events, if available:

    kubectl describe pod -n kaas <pod_name>
    

    Alternatively:

    1. In the Discover section of OpenSearch Dashboards, change the index pattern to kubernetes_events-*.

    2. Expand the time range as required and filter the results to those where kubernetes.event.involved_object.name equals <pod_name>.

    3. In the matched results, inspect the kubernetes.event.message field.

  3. Inspect the host-os-modules-controller logs for error or warning messages:

    kubectl logs -n kaas <pod_name>
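
The three steps above can be chained in a small shell function. This is a minimal sketch, not part of the product tooling: the function name inspect_host_os_modules_controller is hypothetical, and the log filter assumes that problem lines contain the strings "error" or "warn" in any letter case, which may not hold for every message.

```shell
# Hypothetical helper: runs the status, events, and logs checks in sequence.
# Assumes kubectl is already configured for the target cluster.
inspect_host_os_modules_controller() {
  hosm_ns="kaas"
  hosm_selector="app.kubernetes.io/name=host-os-modules-controller"

  # Step 1: show the Pod status.
  kubectl get pod -n "$hosm_ns" -l "$hosm_selector"

  # Steps 2 and 3: describe each matching Pod, then filter its logs.
  for hosm_pod in $(kubectl get pod -n "$hosm_ns" -l "$hosm_selector" -o name); do
    kubectl describe -n "$hosm_ns" "$hosm_pod"
    # Assumption: error and warning lines contain "error" or "warn".
    kubectl logs -n "$hosm_ns" "$hosm_pod" | grep -iE 'error|warn' || true
  done
}
```

After defining the function in the current shell, call inspect_host_os_modules_controller with no arguments. If the filter hides relevant context, inspect the full output of kubectl logs instead.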
    

For further steps, see the Investigation section of the KubePodsCrashLooping alert.

Mitigation

Refer to the Mitigation section of the KubePodsCrashLooping alert.

Day2ManagementDeprecatedConfigs

Root cause

One or more HostOSConfiguration objects contain configuration for one or more deprecated modules.

Investigation

  1. Identify the names of HostOSConfiguration objects with configured deprecated modules:

    kubectl get hoc -A -o go-template="{{range .items}}{{if .status.containsDeprecatedModules}}\
    {{.metadata.namespace}}/{{.metadata.name}}{{\"\n\"}}{{end}}{{end}}"
    
  2. For each object, identify the deprecated modules being used and the list of modules that replace the deprecated ones:

    kubectl get hoc <hoc_name> -n <hoc_namespace> \
    -o go-template="{{range .status.configs}}\
    {{if .moduleDeprecatedBy}}\
    deprecated: {{.moduleName}}-{{.moduleVersion}}; \
    update to: {{range .moduleDeprecatedBy}}{{.name}}-{{.version}}; {{end}}\
    {{\"\n\"}}\
    {{end}}\
    {{end}}"
    
  3. Read the documentation or README of each replacement module, then manually update every affected HostOSConfiguration object to migrate it from the deprecated module to the replacement.
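
Steps 1 and 2 can be combined so that the replacement report is printed for every affected object in one pass. The following is a minimal sketch under stated assumptions: the function name list_deprecated_hoc_modules is hypothetical, and the templates reuse the status fields from the commands above, written in single quotes so that no shell escaping is needed.

```shell
# Hypothetical helper: prints deprecated modules and their replacements
# for every HostOSConfiguration object flagged as using deprecated modules.
list_deprecated_hoc_modules() {
  # Step 1: one "<namespace>/<name>" line per affected object.
  kubectl get hoc -A -o go-template='{{range .items}}{{if .status.containsDeprecatedModules}}{{.metadata.namespace}}/{{.metadata.name}}{{"\n"}}{{end}}{{end}}' |
  while IFS=/ read -r hoc_ns hoc_name; do
    # Skip empty or malformed lines.
    [ -n "$hoc_name" ] || continue
    echo "== $hoc_ns/$hoc_name"
    # Step 2: deprecated modules and their replacements for this object.
    # stdin is redirected so the command cannot consume the loop input.
    kubectl get hoc "$hoc_name" -n "$hoc_ns" -o go-template='{{range .status.configs}}{{if .moduleDeprecatedBy}}deprecated: {{.moduleName}}-{{.moduleVersion}}; update to: {{range .moduleDeprecatedBy}}{{.name}}-{{.version}}; {{end}}{{"\n"}}{{end}}{{end}}' < /dev/null
  done
}
```

After defining the function, call list_deprecated_hoc_modules with no arguments. Each "==" header names one object, followed by one line per deprecated module.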

Mitigation

When configuring HostOSConfiguration objects, use up-to-date module versions that are not deprecated.