Collect the bootstrap logs
If the bootstrap process is stuck or fails, collect and inspect the bootstrap and management cluster logs.
To collect the bootstrap logs:
If the Cluster object is not created yet
List all available deployments:
kubectl --kubeconfig <pathToKindKubeconfig> \
  -n kaas get deploy
Collect the logs of the required deployment:
kubectl --kubeconfig <pathToKindKubeconfig> \
  -n kaas logs -l app.kubernetes.io/name=<deploymentName>
If the Cluster object is created
Select from the following options:
If a management cluster is not deployed yet:
CLUSTER_NAME=<clusterName> ./bootstrap.sh collect_logs
If a management cluster is deployed or pivoting is done:
Obtain the cluster kubeconfig:
./container-cloud get cluster-kubeconfig \
  --kubeconfig <pathToKindKubeconfig> \
  --cluster-name <clusterName> \
  --kubeconfig-output <pathToMgmtClusterKubeconfig>
Collect the logs:
CLUSTER_NAME=<clusterName> \
  KUBECONFIG=<pathToMgmtClusterKubeconfig> \
  ./bootstrap.sh collect_logs
Technology Preview. For bare metal clusters, assess the Ironic pod logs:
Extract the content of the 'message' fields from every log message:
kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | .message'
Extract the content of the 'message' fields from the ironic_conductor source log messages:
kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | select(.source == "ironic_conductor") | .message'
The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.
Note
Add COLLECT_EXTENDED_LOGS=true before the collect_logs command to output the extended version of logs that contains system and MKE logs, logs from LCM Ansible and LCM Agent along with cluster events and Kubernetes resources description and logs.
Without the --extended flag, the basic version of logs is collected, which is sufficient for most use cases. The basic version of logs contains all events, Kubernetes custom resources, and logs from all Container Cloud components. This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script is located.
Logs structure
The Container Cloud logs structure in <output_dir>/<cluster_name>/ is as follows:
/events.log
Human-readable table that contains information about the cluster events.
/system
System logs.
/system/mke (or /system/MachineName/mke)
Mirantis Kubernetes Engine (MKE) logs.
/objects/cluster
Logs of the non-namespaced Kubernetes objects.
/objects/namespaced
Logs of the namespaced Kubernetes objects.
/objects/namespaced/<namespaceName>/core/pods
Logs of the pods from a specific Kubernetes namespace. For example, logs of the pods from the kaas namespace contain logs of Container Cloud controllers, including bootstrap-cluster-controller since Container Cloud 2.25.0.
/objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log
Logs of the pods from a specific Kubernetes namespace that were previously removed or failed.
/objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log
Technology Preview. Ironic pod logs of the bare metal clusters.
Note
Logs collected by the syslog container during the bootstrap phase are not transferred to the management cluster during pivoting. These logs are located in /volume/log/ironic/ansible_conductor.log inside the Ironic pod.
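For instance, to find out which containers were previously removed or failed, you can list the .prev.log files in the collected directory. The following is a minimal sketch that assumes the logs follow the structure described above. The list_prev_logs function name is hypothetical and is not part of the bootstrap script.

```shell
# Hypothetical helper, not part of bootstrap.sh:
# list the logs of previously removed or failed containers.
# $1 - path to <output_dir>/<cluster_name>
list_prev_logs() {
  # Search all namespaces recursively and sort for stable output.
  find "$1/objects/namespaced" -type f -name '*.prev.log' | sort
}
```

For example, run list_prev_logs <output_dir>/<cluster_name> after the collect_logs command completes.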
Each log entry of the management cluster logs contains a request ID that identifies the chronology of actions performed on a cluster or machine. The format of the log entry is as follows:
<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>
For example, os.machine.req:28 contains information about the task 28 applied to an OpenStack machine.
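Because every entry carries the request ID, one way to follow a single task is to filter a log file by that ID. The following is a minimal sketch that assumes the entries use the plain-text format shown above. The entries_for_request function name and its arguments are hypothetical.

```shell
# Hypothetical helper: print all entries logged under one request ID.
# $1 - process and subprocess IDs, for example "os.machine"
# $2 - request ID, for example 28
# $3 - log file to search
entries_for_request() {
  # -F matches the string literally; the trailing colon keeps
  # "req:28:" from also matching "req:280:".
  grep -F -- "$1.req:$2:" "$3"
}
```

For example, entries_for_request os.machine 28 <logFile> prints only the entries of the task 28 applied to a machine.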
Since Container Cloud 2.22.0, the logging format has the following extended structure for the admission-controller, storage-discovery, and all supported <providerName>-provider services of a management cluster:
level:<debug,info,warn,error,panic>,
ts:<YYYY-MM-DDTHH:mm:ssZ>,
logger:<processID>.<subProcessID(s)>.req:<requestID>,
caller:<lineOfCode>,
msg:<message>,
error:<errorMessage>,
stacktrace:<codeInfo>
Since Container Cloud 2.23.0, this structure also applies to the <name>-controller services of a management cluster.
Example of a log extract for openstack-provider since 2.22.0:
{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","caller":"lcm/machine.go:808","msg":"","error":"could not determine machine demo-46880-bastion host name","stacktrace":"sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm.GetMachineConditions\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/machine.go:808\nsigs.k8s.io/cluster-api-provider-openstack/pkg...."}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"service/reconcile.go:128","msg":"request: default/demo-46880-2"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:201","msg":"Reconciling Machine \"default/demo-46880-2\""}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:454","msg":"Checking if machine exists: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:327","msg":"Reconciling machine \"default/demo-46880-2\" triggers idempotent update"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:290","msg":"Updating machine: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:24Z","logger":"os.machine.req:476","caller":"lcm/machine.go:73","msg":"Machine in LCM cluster, reconciling LCM objects"}
{"level":"info","ts":"2022-11-14T21:37:26Z","logger":"os.machine.req:476","caller":"lcm/machine.go:902","msg":"Updating Machine default/demo-46880-2 conditions"}
level
Informational level. Possible values: debug, info, warn, error, panic.
ts
Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example: 2022-11-14T21:37:23Z.
logger
Details on the process ID being logged:
<processID>
Primary process identifier. The list of possible values includes bm, os, iam, license, and bootstrap.
Note
The iam and license values are available since Container Cloud 2.23.0. The bootstrap value is available since Container Cloud 2.25.0.
<subProcessID(s)>
One or more secondary process identifiers. The list of possible values includes cluster, machine, controller, and cluster-ctrl.
Note
The controller value is available since Container Cloud 2.23.0. The cluster-ctrl value is available since Container Cloud 2.25.0 for the bootstrap process identifier.
req
Request ID number that increases when a service performs the following actions:
Receives a request from Kubernetes about creating, updating, or deleting an object
Receives an HTTP request
Runs a background process
The request ID allows combining all operations performed with an object within one request. For example, the result of a Machine object creation, update of its statuses, and so on has the same request ID.
caller
Code line used to apply the corresponding action to an object.
msg
Description of a deployment or update phase. If msg is empty, the log entry contains the "error" key with a message followed by the "stacktrace" key with stack trace details. For example:
"msg"="" "error"="Cluster nodes are not yet ready" "stacktrace": "<stack-trace-info>"
The log format of the following Container Cloud components does not contain the "stacktrace" key for easier log handling: baremetal-provider, bootstrap-provider, and host-os-modules-controller.
Note
Logs may also include a number of informational key-value pairs containing additional cluster details. For example, "name": "object-name", "foobar": "baz".
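The level and logger fields described above can be combined to narrow a structured log down to the errors of one request. The following is a minimal sketch that assumes each entry is a single compact JSON object per line, as in the openstack-provider extract. The errors_for_logger function name is hypothetical.

```shell
# Hypothetical helper: print error-level entries of one logger value.
# Assumes one compact JSON object per line, without spaces around colons.
# $1 - full logger value, for example "os.cluster.req:318"
# $2 - log file to search
errors_for_logger() {
  # The closing quote after the logger value keeps "req:318"
  # from also matching "req:3180".
  grep -F -- "\"logger\":\"$1\"" "$2" | grep -F -- '"level":"error"'
}
```

For example, errors_for_logger os.cluster.req:318 <logFile> prints only the error entries recorded for the request 318 on a cluster. If the entries are pretty-printed or contain spaces after colons, this literal matching does not apply and a JSON-aware tool is a better fit.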
Depending on the type of issue found in logs, apply the corresponding fixes. For example, if you detect the LoadBalancer ERROR state errors during the bootstrap of an OpenStack-based management cluster, contact your system administrator to fix the issue.