Collect the bootstrap logs
If the bootstrap script fails during the deployment process, collect and inspect the bootstrap and management cluster logs.
Collect the bootstrap cluster logs
Log in to your local machine where the bootstrap script was executed.
If you bootstrapped the cluster a while ago, verify that the bootstrap directory is updated.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 or later:
./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
  --target-dir <pathToBootstrapDirectory>
For clusters deployed using a Container Cloud release earlier than 2.11.0 or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:
wget https://binary.mirantis.com/releases/get_container_cloud.sh
chmod 0755 get_container_cloud.sh
./get_container_cloud.sh
Run the following command:
./bootstrap.sh collect_logs
Add COLLECT_EXTENDED_LOGS=true before the command to output the extended version of logs, which contains system and MKE logs, logs from LCM Ansible and LCM Agent, cluster events, and descriptions and logs of Kubernetes resources.
Without the --extended flag, the basic version of logs is collected, which is sufficient for most use cases. The basic version of logs contains all events, Kubernetes custom resources, and logs from all Container Cloud components. This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script is located.
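If the collection step is driven from automation, the call can be wrapped in a few lines of Python. This is a minimal sketch, not part of Container Cloud: the function name collect_logs and its parameters are hypothetical, and only the ./bootstrap.sh collect_logs command and the COLLECT_EXTENDED_LOGS=true variable come from the steps above. The run parameter exists only so that the call can be replaced in tests.

```python
import os
import subprocess


def collect_logs(bootstrap_dir, extended=False, run=subprocess.run):
    """Run ./bootstrap.sh collect_logs from the bootstrap directory.

    bootstrap_dir: directory that contains bootstrap.sh (for example,
                   the kaas-bootstrap folder).
    extended:      when True, export COLLECT_EXTENDED_LOGS=true for the
                   child process, as described in the step above.
    run:           callable with the subprocess.run signature
                   (hypothetical hook that eases testing).
    """
    env = dict(os.environ)  # copy so the parent environment stays untouched
    if extended:
        env["COLLECT_EXTENDED_LOGS"] = "true"
    # check=True raises CalledProcessError if the script exits non-zero.
    return run(
        ["./bootstrap.sh", "collect_logs"],
        cwd=str(bootstrap_dir),
        env=env,
        check=True,
    )
```

For example, collect_logs("kaas-bootstrap", extended=True) would request the extended log set, assuming the script is located in that folder.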
Technology Preview. For bare metal clusters, assess the Ironic pod logs:
Extract the content of the 'message' fields from every log message:
kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM '.message'
Extract the content of the 'message' fields from the ironic_conductor source log messages:
kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM 'select(.source == "ironic_conductor") | .message'
The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.
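Where jq is not available, the same filtering can be approximated in Python on the saved output of kubectl logs. The following is a sketch under assumptions: the function name syslog_messages is hypothetical, the sample records are invented for illustration, and only the message and source field names are taken from the commands above.

```python
import json


def syslog_messages(lines, source=None):
    """Return the 'message' field of each JSON log line.

    lines:  iterable of strings, one JSON document per line.
    source: if set, keep only records whose 'source' field equals it
            (mirrors the jq select(.source == ...) filter).
    """
    messages = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        if source is not None and record.get("source") != source:
            continue
        if "message" in record:
            messages.append(record["message"])
    return messages


# Hypothetical records, shown only to illustrate the expected shape.
SAMPLE_SYSLOG = [
    '{"source": "ironic_conductor", "message": "hypothetical conductor message"}',
    '{"source": "other_source", "message": "hypothetical message from another source"}',
]

conductor_only = syslog_messages(SAMPLE_SYSLOG, source="ironic_conductor")
```

Unlike jq, this sketch silently skips records that have no message field instead of printing null.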
Logs structure
The Container Cloud logs structure in <output_dir>/<cluster_name>/ is as follows:
/events.log - human-readable table that contains information about the cluster events.
/system - system logs.
/system/mke (or /system/MachineName/mke) - Mirantis Kubernetes Engine (MKE) logs.
/objects/cluster - logs of the non-namespaced Kubernetes objects.
/objects/namespaced - logs of the namespaced Kubernetes objects.
/objects/namespaced/<namespaceName>/core/pods - pod logs from a specified Kubernetes namespace.
/objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log - logs of the pods from a specified Kubernetes namespace that were previously removed or failed.
/objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log - Technology Preview. Ironic pod logs of the bare metal clusters.
Note
Logs collected by the syslog container during the bootstrap phase are not transferred to the management cluster during pivoting. These logs are located in /volume/log/ironic/ansible_conductor.log inside the Ironic pod.
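Because logs of removed or failed pods are often the first place to look, it can help to list every .prev.log file in a collected archive. The sketch below assumes the directory layout described above; the function name previous_pod_logs is hypothetical, and the argument is the <output_dir>/<cluster_name>/ directory.

```python
from collections import defaultdict
from pathlib import Path


def previous_pod_logs(cluster_logs_dir):
    """Map each namespace to the *.prev.log files found beneath it.

    cluster_logs_dir: path to <output_dir>/<cluster_name>/.
    Returns a dict {namespaceName: [Path, ...]}; empty if the
    objects/namespaced directory does not exist.
    """
    base = Path(cluster_logs_dir) / "objects" / "namespaced"
    if not base.is_dir():
        return {}
    by_namespace = defaultdict(list)
    for path in sorted(base.rglob("*.prev.log")):
        parts = path.relative_to(base).parts
        if len(parts) < 2:
            continue  # file sits directly in objects/namespaced, no namespace
        by_namespace[parts[0]].append(path)
    return dict(by_namespace)
```

The search is recursive, so it also finds files that sit in a per-pod subdirectory under core/pods.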
Each log entry of the management cluster logs contains a request ID that identifies the chronology of actions performed on a cluster or machine. The format of the log entry is as follows:
<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>
For example, os.machine.req:28 contains information about the task 28 applied to an OpenStack machine.
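The identifier part of this format can be split into its components with a small parser. This is a sketch, assuming that the request ID is always numeric, as in the examples in this section, and that identifiers contain only letters, digits, underscores, and hyphens. The names LoggerId and parse_logger are hypothetical. The parser handles only the part before the log message, for example os.machine.req:28.

```python
import re
from typing import List, NamedTuple


class LoggerId(NamedTuple):
    process: str             # <process ID>, for example "os"
    subprocesses: List[str]  # zero or more <subprocess ID> values
    request_id: int          # <requestID>


# <process ID>.[<subprocess ID>...].req:<requestID>
_LOGGER_RE = re.compile(r"^(?P<path>[\w-]+(?:\.[\w-]+)*)\.req:(?P<req>\d+)$")


def parse_logger(value):
    """Split an identifier such as 'os.machine.req:28' into its parts."""
    match = _LOGGER_RE.match(value.strip())
    if match is None:
        raise ValueError("unexpected logger format: %r" % (value,))
    process, *subprocesses = match.group("path").split(".")
    return LoggerId(process, subprocesses, int(match.group("req")))


parsed = parse_logger("os.machine.req:28")
```

Here parsed.process is "os", parsed.subprocesses is ["machine"], and parsed.request_id is 28.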
Since Container Cloud 2.22.0, the logging format has the following extended structure for the admission-controller, storage-discovery, and all supported <providerName>-provider services of a management cluster:
level:<debug,info,warn,error,panic>,
ts:<YYYY-MM-DDTHH:mm:ssZ>,
logger:<processID>.<subProcessID(s)>.req:<requestID>,
caller:<lineOfCode>,
msg:<message>,
error:<errorMessage>,
stacktrace:<codeInfo>
Since Container Cloud 2.23.0, this structure also applies to the <name>-controller services of a management cluster.
Example of a log extract for openstack-provider since 2.22.0:
{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","caller":"lcm/machine.go:808","msg":"","error":"could not determine machine demo-46880-bastion host name","stacktrace":"sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm.GetMachineConditions\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/machine.go:808\nsigs.k8s.io/cluster-api-provider-openstack/pkg...."}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"service/reconcile.go:128","msg":"request: default/demo-46880-2"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:201","msg":"Reconciling Machine \"default/demo-46880-2\""}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:454","msg":"Checking if machine exists: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:327","msg":"Reconciling machine \"default/demo-46880-2\" triggers idempotent update"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:290","msg":"Updating machine: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:24Z","logger":"os.machine.req:476","caller":"lcm/machine.go:73","msg":"Machine in LCM cluster, reconciling LCM objects"}
{"level":"info","ts":"2022-11-14T21:37:26Z","logger":"os.machine.req:476","caller":"lcm/machine.go:902","msg":"Updating Machine default/demo-46880-2 conditions"}
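Since every line of such an extract is a JSON document, the entries that belong to one request can be pulled out programmatically. The sketch below is illustrative only: the function name entries_for_request is hypothetical, and the sample lines are copied from the extract above, with the stacktrace field of the first entry left out for brevity.

```python
import json


def entries_for_request(lines, request_id):
    """Return parsed entries whose logger ends with .req:<request_id>.

    lines:      iterable of strings, one JSON log entry per line.
    request_id: the request number to follow, for example 476.
    The result is sorted by the 'ts' field; ISO 8601 timestamps in the
    same format sort correctly as plain strings.
    """
    suffix = ".req:%s" % request_id
    selected = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        if str(entry.get("logger", "")).endswith(suffix):
            selected.append(entry)
    return sorted(selected, key=lambda e: e.get("ts", ""))


# Lines taken from the extract above (stacktrace omitted in the first one).
SAMPLE_LINES = [
    '{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","caller":"lcm/machine.go:808","msg":"","error":"could not determine machine demo-46880-bastion host name"}',
    '{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"service/reconcile.go:128","msg":"request: default/demo-46880-2"}',
    '{"level":"info","ts":"2022-11-14T21:37:24Z","logger":"os.machine.req:476","caller":"lcm/machine.go:73","msg":"Machine in LCM cluster, reconciling LCM objects"}',
]

request_476 = entries_for_request(SAMPLE_LINES, 476)
```

With the sample data, request_476 holds the two os.machine.req:476 entries in chronological order.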
level
Informational level. Possible values: debug, info, warn, error, panic.
ts
Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example: 2022-11-14T21:37:23Z.
logger
Details on the process ID being logged:
<processID>
Primary process identifier. The list of possible values includes bm, os, vsphere, iam, and license. The iam and license values are available since Container Cloud 2.23.0.
<subProcessID(s)>
One or more secondary process identifiers. The list of possible values includes cluster, machine, and controller. The controller value is available since Container Cloud 2.23.0.
req
Request ID number that increases when a service performs the following actions:
Receives a request from Kubernetes about creating, updating, or deleting an object
Receives an HTTP request
Runs a background process
The request ID allows combining all operations performed with an object within one request. For example, the creation of a Machine object, the updates of its statuses, and so on share the same request ID.
caller
Code line used to apply the corresponding action to an object.
msg
Description of a deployment or update phase. If empty, the log entry contains an error message with a stacktrace.
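The field descriptions above suggest a simple way to list failures: select entries with the error or panic level and fall back to the error field when msg is empty. The following is a sketch, not a Container Cloud tool. The names entry_text and failures are hypothetical, and the sample lines reuse values from the log extract in this section with the stacktrace field omitted.

```python
import json

FAILURE_LEVELS = {"error", "panic"}


def entry_text(entry):
    """Return msg, or the error field when msg is empty or missing."""
    return entry.get("msg") or entry.get("error", "")


def failures(lines):
    """Return (ts, logger, text) tuples for error and panic entries."""
    found = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(entry, dict) and entry.get("level") in FAILURE_LEVELS:
            found.append(
                (entry.get("ts", ""), entry.get("logger", ""), entry_text(entry))
            )
    return found


# Values reused from the extract in this section (stacktrace omitted).
SAMPLE_ENTRIES = [
    '{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","msg":"","error":"could not determine machine demo-46880-bastion host name"}',
    '{"level":"info","ts":"2022-11-14T21:37:26Z","logger":"os.machine.req:476","msg":"Updating Machine default/demo-46880-2 conditions"}',
]

found_failures = failures(SAMPLE_ENTRIES)
```

The logger value of each reported failure carries the request ID, which can then be used to look up the remaining entries of the same request.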
Depending on the type of issue found in logs, apply the corresponding fixes.
For example, if you detect the LoadBalancer ERROR state errors during the bootstrap of an OpenStack-based management cluster, contact your system administrator to fix the issue.
To troubleshoot other issues, refer to the corresponding section in Troubleshooting.