Collect the bootstrap logs

If the bootstrap script fails during the deployment process, collect and inspect the bootstrap and management cluster logs.

Collect the bootstrap cluster logs

  1. Log in to your local machine where the bootstrap script was executed.

  2. If you bootstrapped the cluster a while ago, verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using the Container Cloud release earlier than 2.11.0 or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      
      chmod 0755 get_container_cloud.sh
      
      ./get_container_cloud.sh
      
  3. Run the following command:

    ./bootstrap.sh collect_logs
    

    Add COLLECT_EXTENDED_LOGS=true before the command to output the extended version of logs that contains system and MKE logs, logs from LCM Ansible and LCM Agent along with cluster events and Kubernetes resources description and logs.

    Without the --extended flag, the basic version of logs is collected, which is sufficient for most use cases. The basic version of logs contains all events, Kubernetes custom resources, and logs from all Container Cloud components. This version does not require passing --key-file.

    The logs are collected in the directory where the bootstrap script is located.

  4. Technology Preview. For bare metal clusters, assess the Ironic pod logs:

    • Extract the content of the 'message' fields from every log message:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM '.message'
      
    • Extract the content of the 'message' fields from the ironic_conductor source log messages:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM 'select(.source == "ironic_conductor") | .message'
      

    The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.

Logs structure

The Container Cloud logs structure in <output_dir>/<cluster_name>/ is as follows:

  • /events.log - human-readable table that contains information about the cluster events.

  • /system - system logs.

  • /system/mke (or /system/MachineName/mke) - Mirantis Kuberntes Engine (MKE) logs.

  • /objects/cluster - logs of the non-namespaced Kubernetes objects.

  • /objects/namespaced - logs of the namespaced Kubernetes objects.

  • /objects/namespaced/<namespaceName>/core/pods - pods logs from a specified Kubernetes namespace.

  • /objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log - logs of the pods from a specified Kubernetes namespace that were previously removed or failed.

  • /objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log - Technology Preview. Ironic pod logs of the bare metal clusters.

    Note

    Logs collected by the syslog container during the bootstrap phase are not transferred to the management cluster during pivoting. These logs are located in /volume/log/ironic/ansible_conductor.log inside the Ironic pod.

Each log entry of the management cluster logs contains a request ID that identifies chronology of actions performed on a cluster or machine. The format of the log entry is as follows:

<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>

For example, os.machine.req:28 contains information about the task 28 applied to an OpenStack machine.

Since Container Cloud 2.22.0, the logging format has the following extended structure for the admission-controller, storage-discovery, and all supported <providerName>-provider services of a management cluster:

level:<debug,info,warn,error,panic>,
ts:<YYYY-MM-DDTHH:mm:ssZ>,
logger:<processID>.<subProcessID(s)>.req:<requestID>,
caller:<lineOfCode>,
msg:<message>,
error:<errorMessage>,
stacktrace:<codeInfo>

Since Container Cloud 2.23.0, this structure also applies to the <name>-controller services of a management cluster.

Example of a log extract for openstack-provider since 2.22.0
{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","caller":"lcm/machine.go:808","msg":"","error":"could not determine machine demo-46880-bastion host name”,”stacktrace”:”sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm.GetMachineConditions\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/machine.go:808\nsigs.k8s.io/cluster-api-provider-openstack/pkg...."}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"service/reconcile.go:128","msg":"request: default/demo-46880-2"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:201","msg":"Reconciling Machine \"default/demo-46880-2\""}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:454","msg":"Checking if machine exists: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:327","msg":"Reconciling machine \"default/demo-46880-2\" triggers idempotent update"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:290","msg":"Updating machine: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:24Z","logger":"os.machine.req:476","caller":"lcm/machine.go:73","msg":"Machine in LCM cluster, reconciling LCM objects"}
{"level":"info","ts":"2022-11-14T21:37:26Z","logger":"os.machine.req:476","caller":"lcm/machine.go:902","msg":"Updating Machine default/demo-46880-2 conditions"}
  • level

    Informational level. Possible values: debug, info, warn, error, panic.

  • ts

    Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example: 2022-11-14T21:37:23Z.

  • logger

    Details on the process ID being logged:

    • <processID>

      Primary process identifier. The list of possible values includes bm, os, vsphere, iam, and license. The iam and license values are available since Container Cloud 2.23.0.

    • <subProcessID(s)>

      One or more secondary process identifiers. The list of possible values includes cluster, machine, and controller. The controller value is available since Container Cloud 2.23.0.

    • req

      Request ID number that increases when a service performs the following actions:

      • Receives a request from Kubernetes about creating, updating, or deleting an object

      • Receives an HTTP request

      • Runs a background process

      The request ID allows combining all operations performed with an object within one request. For example, the result of a Machine object creation, update of its statuses, and so on has the same request ID.

  • caller

    Code line used to apply the corresponding action to an object.

  • msg

    Description of a deployment or update phase. If empty, it contains an error message with a stacktrace.

Depending on the type of issue found in logs, apply the corresponding fixes. For example, if you detect the LoadBalancer ERROR state errors during the bootstrap of an OpenStack-based management cluster, contact your system administrator to fix the issue. To troubleshoot other issues, refer to the corresponding section in Troubleshooting.