Troubleshoot your MSR Kubernetes deployment

You can use general Kubernetes troubleshooting and debugging techniques to diagnose problems with your MSR Kubernetes deployment.

To list the Pods in the current namespace and check for any that have failed to start:

kubectl get pods

Example output:

NAME                                     READY   STATUS              RESTARTS      AGE
msr-api-95dc9979b-4sgfg                  1/1     Running             3 (54s ago)   99s
msr-enzi-api-6f6f54c4c5-72bkb            1/1     Running             1 (39s ago)   100s
msr-enzi-worker-55b5786699-pnlh4         1/1     Running             3 (81s ago)   100s
msr-garant-84c5d9489b-t4bl4              1/1     Running             3 (51s ago)   100s
msr-jobrunner-default-7fcc9bb849-4whcl   1/1     Running             3 (54s ago)   100s
msr-nginx-76dbf47797-slllp               0/1     ContainerCreating   0             99s
msr-notary-server-6dfb9c67c9-mft97       1/1     Running             2 (85s ago)   99s
msr-notary-signer-576c5f574b-ftm5z       1/1     Running             2 (90s ago)   99s
msr-registry-7df8fd6fcd-l67d6            1/1     Running             3 (51s ago)   100s
msr-rethinkdb-cluster-0                  1/1     Running             0             100s
msr-rethinkdb-proxy-d5798dd75-ft75c      1/1     Running             2 (85s ago)   99s
msr-scanningstore-0                      1/1     Running             0             99s
postgres-operator-569b58b8c6-c6vxv       1/1     Running             0             32h
postgres-operator-ui-7b9f8d69bc-pv9nm    1/1     Running             0             32h
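
In a long listing, the one Pod that is not ready is easy to overlook. As a minimal sketch, the shell function below filters the listing down to Pods whose READY count is incomplete. The function name not_ready is hypothetical, and the filter assumes the default column layout shown above (NAME, READY, STATUS), so it does not apply unchanged to --all-namespaces output, which adds a leading NAMESPACE column.

```shell
# Hypothetical helper: print the name and status of every Pod whose READY
# column (for example 0/1) shows fewer ready containers than expected.
# Assumes the default `kubectl get pods` layout: NAME READY STATUS ...
not_ready() {
  awk 'NR > 1 { split($2, ready, "/"); if (ready[1] != ready[2]) print $1, $3 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods | not_ready
```

Pods that have run to completion also report an incomplete READY count, so treat the result as a list of candidates to inspect rather than a definitive list of failures.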

To review additional details about each Pod, such as its IP address and the node it runs on:

kubectl get pods -o wide

Example output:

NAME                                     READY   STATUS              RESTARTS        AGE     IP            NODE       NOMINATED NODE   READINESS GATES
msr-api-95dc9979b-4sgfg                  1/1     Running             3 (2m48s ago)   3m33s   172.17.0.14   minikube   <none>           <none>
msr-enzi-api-6f6f54c4c5-72bkb            1/1     Running             1 (2m33s ago)   3m34s   172.17.0.13   minikube   <none>           <none>
msr-enzi-worker-55b5786699-pnlh4         1/1     Running             3 (3m15s ago)   3m34s   172.17.0.8    minikube   <none>           <none>
msr-garant-84c5d9489b-t4bl4              1/1     Running             3 (2m45s ago)   3m34s   172.17.0.11   minikube   <none>           <none>
msr-jobrunner-default-7fcc9bb849-4whcl   1/1     Running             3 (2m48s ago)   3m34s   172.17.0.9    minikube   <none>           <none>
msr-nginx-76dbf47797-slllp               0/1     ContainerCreating   0               3m33s   <none>        minikube   <none>           <none>
msr-notary-server-6dfb9c67c9-mft97       1/1     Running             3 (51s ago)     3m33s   172.17.0.18   minikube   <none>           <none>
msr-notary-signer-576c5f574b-ftm5z       1/1     Running             3 (57s ago)     3m33s   172.17.0.12   minikube   <none>           <none>
msr-registry-7df8fd6fcd-l67d6            1/1     Running             3 (2m45s ago)   3m34s   172.17.0.15   minikube   <none>           <none>
msr-rethinkdb-cluster-0                  1/1     Running             0               3m34s   172.17.0.10   minikube   <none>           <none>
msr-rethinkdb-proxy-d5798dd75-ft75c      1/1     Running             2 (3m19s ago)   3m33s   172.17.0.17   minikube   <none>           <none>
msr-scanningstore-0                      1/1     Running             0               3m33s   172.17.0.16   minikube   <none>           <none>
postgres-operator-569b58b8c6-c6vxv       1/1     Running             0               32h     172.17.0.7    minikube   <none>           <none>
postgres-operator-ui-7b9f8d69bc-pv9nm    1/1     Running             0               32h     172.17.0.6    minikube   <none>           <none>

To review the Pods running in all namespaces:

kubectl get pods --all-namespaces

Example output:

NAMESPACE      NAME                                       READY   STATUS              RESTARTS        AGE
cert-manager   cert-manager-7dd5854bb4-hx7mj              1/1     Running             1 (7d5h ago)    7d9h
cert-manager   cert-manager-cainjector-64c949654c-gwvgg   1/1     Running             2 (2d9h ago)    7d9h
cert-manager   cert-manager-webhook-6b57b9b886-7prtc      1/1     Running             1 (2d9h ago)    7d9h
default        msr-api-95dc9979b-4sgfg                    1/1     Running             3 (4m44s ago)   5m29s
default        msr-enzi-api-6f6f54c4c5-72bkb              1/1     Running             1 (4m29s ago)   5m30s
default        msr-enzi-worker-55b5786699-pnlh4           1/1     Running             3 (5m11s ago)   5m30s
default        msr-garant-84c5d9489b-t4bl4                1/1     Running             3 (4m41s ago)   5m30s
default        msr-jobrunner-default-7fcc9bb849-4whcl     1/1     Running             3 (4m44s ago)   5m30s
default        msr-nginx-76dbf47797-slllp                 0/1     ContainerCreating   0               5m29s
default        msr-notary-server-6dfb9c67c9-mft97         1/1     Running             3 (2m47s ago)   5m29s
default        msr-notary-signer-576c5f574b-ftm5z         1/1     Running             3 (2m53s ago)   5m29s
default        msr-registry-7df8fd6fcd-l67d6              1/1     Running             3 (4m41s ago)   5m30s
default        msr-rethinkdb-cluster-0                    1/1     Running             0               5m30s
default        msr-rethinkdb-proxy-d5798dd75-ft75c        1/1     Running             2 (5m15s ago)   5m29s
default        msr-scanningstore-0                        1/1     Running             0               5m29s
default        postgres-operator-569b58b8c6-c6vxv         1/1     Running             0               32h
default        postgres-operator-ui-7b9f8d69bc-pv9nm      1/1     Running             0               32h
kube-system    coredns-78fcd69978-48bfx                   1/1     Running             1 (7d5h ago)    7d9h
kube-system    etcd-minikube                              1/1     Running             1 (2d9h ago)    7d9h
kube-system    kube-apiserver-minikube                    1/1     Running             1 (2d9h ago)    7d9h
kube-system    kube-controller-manager-minikube           1/1     Running             1 (7d5h ago)    7d9h
kube-system    kube-proxy-2h2z5                           1/1     Running             1 (2d9h ago)    7d9h
kube-system    kube-scheduler-minikube                    1/1     Running             1 (2d9h ago)    7d9h
kube-system    storage-provisioner                        1/1     Running             2 (2d9h ago)    7d9h

To review all services in the current namespace:

kubectl get services

Example output:

NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
kubernetes                 ClusterIP   10.96.0.1        <none>        443/TCP            7d10h
msr                        ClusterIP   10.98.33.163     <none>        8080/TCP,443/TCP   8m14s
msr-api                    ClusterIP   10.102.145.77    <none>        443/TCP            8m14s
msr-enzi                   ClusterIP   10.102.7.61      <none>        4443/TCP           8m14s
msr-garant                 ClusterIP   10.102.139.182   <none>        443/TCP            8m14s
msr-notary                 ClusterIP   10.107.27.10     <none>        443/TCP            8m14s
msr-notary-signer          ClusterIP   10.103.28.108    <none>        7899/TCP           8m14s
msr-registry               ClusterIP   10.109.12.52     <none>        443/TCP            8m14s
msr-rethinkdb-admin        ClusterIP   None             <none>        8080/TCP           8m14s
msr-rethinkdb-cluster      ClusterIP   None             <none>        29015/TCP          8m14s
msr-rethinkdb-proxy        ClusterIP   10.103.235.96    <none>        28015/TCP          8m14s
msr-scanningstore          ClusterIP   10.99.62.126     <none>        5432/TCP           8m13s
msr-scanningstore-config   ClusterIP   None             <none>        <none>             7m56s
msr-scanningstore-repl     ClusterIP   10.107.82.163    <none>        5432/TCP           8m13s
postgres-operator          ClusterIP   10.108.77.171    <none>        8080/TCP           32h
postgres-operator-ui       ClusterIP   10.108.138.75    <none>        80/TCP             32h

To review the state of a running or failed Pod:

kubectl describe pod msr-nginx-76dbf47797-slllp

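Example output, which includes the Pod status, environment variables, certificates in use, and recent events. The events often indicate why a Pod failed to start. In this example, the FailedMount warnings show that the Pod cannot start because the secret named "bad" does not exist: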

Name:           msr-nginx-76dbf47797-slllp
Namespace:      default
Priority:       0
Node:           minikube/192.168.49.2
Start Time:     Wed, 17 Nov 2021 19:22:17 -0500
Labels:         app.kubernetes.io/component=nginx
                app.kubernetes.io/instance=msr
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=msr
                app.kubernetes.io/version=3.0.0-tp2
                helm.sh/chart=msr-1.0.0-tp2.1
                pod-template-hash=76dbf47797
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/msr-nginx-76dbf47797

   .
   .
   .
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/arch=amd64
                             kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                   From               Message
  ----     ------       ----                  ----               -------
  Normal   Scheduled    9m17s                 default-scheduler  Successfully assigned default/msr-nginx-76dbf47797-slllp to minikube
  Warning  FailedMount  58s (x12 over 9m13s)  kubelet            MountVolume.SetUp failed for volume "secrets" : secret "bad" not found
  Warning  FailedMount  27s (x4 over 7m15s)   kubelet            Unable to attach or mount volumes: unmounted volumes=[secrets], unattached volumes=[secrets kube-api-access-6h99g]: timed out waiting for the condition
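
When the events point to a missing Secret, as the FailedMount warnings do here, it can help to look at the events on their own and to confirm which Secrets the Pod expects. The following is a sketch only: the function names pod_events and pod_secret_names are hypothetical, and both assume that the Pod is in the current namespace.

```shell
# Hypothetical helper: show only the events recorded for one Pod,
# sorted by the time each event was last seen.
pod_events() {
  kubectl get events \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Hypothetical helper: print the names of the Secrets that the Pod's
# volumes reference, so that each name can be checked separately.
pod_secret_names() {
  kubectl get pod "$1" -o jsonpath='{.spec.volumes[*].secret.secretName}'
}

# Usage against a live cluster (requires kubectl access):
#   pod_events msr-nginx-76dbf47797-slllp
#   pod_secret_names msr-nginx-76dbf47797-slllp
#   kubectl get secret <secret-name>   # reports NotFound if the Secret is missing
```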

To view the Pod logs:

kubectl logs <pod-name>
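
Several Pods in the example listings have restarted, and after a restart kubectl logs shows only the output of the current container instance. A minimal sketch of a wrapper that also retrieves the logs of the previous instance follows. The function name pod_logs and its default of 50 lines are assumptions made for illustration.

```shell
# Hypothetical helper: print the last lines of a Pod's current logs and,
# if a container has restarted, the last lines of its previous instance.
# Usage: pod_logs <pod-name> [line-count]
pod_logs() {
  pod="$1"
  lines="${2:-50}"   # assumed default; adjust as needed
  echo "== current =="
  kubectl logs "$pod" --tail="$lines"
  echo "== previous =="
  # --previous fails when no earlier container instance exists.
  kubectl logs "$pod" --previous --tail="$lines" \
    || echo "(no previous container instance)"
}
```

For a Pod with more than one container, add the -c <container-name> option to each kubectl logs call.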

To open an interactive shell inside a Pod for further inspection:

kubectl exec --stdin --tty <pod-name> -- /bin/sh
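
An interactive shell is not always necessary. The same kubectl exec command can run a single command in the Pod and return its output, which is convenient for quick checks such as listing environment variables. The sketch below assumes that the command exists in the container image, and the function name pod_run is hypothetical.

```shell
# Hypothetical helper: run one command in a Pod without opening a shell.
# Usage: pod_run <pod-name> <command> [args...]
pod_run() {
  pod="$1"
  shift
  kubectl exec "$pod" -- "$@"
}

# Usage against a live cluster (requires kubectl access):
#   pod_run <pod-name> env
```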