This section describes basic troubleshooting steps for the OpenContrail-related services.
To perform initial troubleshooting:
Verify the NTP peers on every node of your MCP cluster:
ntpq -p
Example of system response:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+tik.cesnet.cz   195.113.144.238  2 u  728 1024  377    4.645   -0.199   0.545
*netopyr.hanacke .GPS.            1 u 1604 1024  276   14.931   -0.021   0.373
If at least one of the peers has an asterisk (*) before its name, the time is synchronized.
Otherwise, inspect the /etc/ntp.conf file.
Example of an ntp.conf file:
# Associate to cloud NTP pool servers
server ntp.cesnet.cz iburst
server pool.ntp.org
# Only allow read-only access from localhost
restrict default noquery nopeer
restrict 127.0.0.1
restrict ::1
# Location of drift file
driftfile /var/lib/ntp/ntp.drift
logfile /var/log/ntp.log
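The asterisk check described above can be scripted for use across many nodes. The following is a minimal sketch, not part of the product tooling: the function name ntp_is_synced is hypothetical, and it assumes the `ntpq -p` output format shown in the example.

```shell
# Hypothetical helper: reads `ntpq -p` output on stdin and returns 0
# only if at least one peer line starts with "*" (the selected sync source).
ntp_is_synced() {
    awk '/^\*/ { found = 1 } END { exit !found }'
}

# Usage on a node where ntpq is installed:
#   ntpq -p | ntp_is_synced && echo "time is synchronized" \
#                           || echo "not synchronized, inspect /etc/ntp.conf"
```

The function only parses text, so it can also be fed output collected from remote nodes.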
Verify the disk space, inode, RAM, and CPU usage on every OpenContrail node. The usage of each resource in the output must not exceed 90%.
To verify the disk space:
df -h
Example of system response:
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G   12K  3.9G   1% /dev
tmpfs           799M  380K  798M   1% /run
/dev/vda1        48G  5.7G   41G  13% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            3.9G   12K  3.9G   1% /run/shm
none            100M     0  100M   0% /run/user
To verify the inode usage:
df -i
Example of system response:
Filesystem       Inodes   IUsed    IFree IUse% Mounted on
udev            2032563     533  2032030    1% /dev
tmpfs           2037690     781  2036909    1% /run
/dev/sda1       6250496 1396006  4854490   23% /
tmpfs           2037690     304  2037386    1% /dev/shm
tmpfs           2037690       6  2037684    1% /run/lock
tmpfs           2037690      18  2037672    1% /sys/fs/cgroup
/dev/sda6      53821440  731583 53089857    2% /home
cgmfs           2037690      14  2037676    1% /run/cgmanager/fs
tmpfs           2037690      44  2037646    1% /run/user/1000
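Both the disk space and inode outputs carry the usage percentage in the fifth column, so one filter can flag the file systems that exceed the 90% limit. The following is a sketch under stated assumptions: the function name df_over_limit is hypothetical, the input is expected in the POSIX `df -P` layout (one line per file system), and mount points are assumed to contain no spaces.

```shell
# Hypothetical helper: reads `df -P` or `df -Pi` output on stdin and prints
# every mount point whose usage column (field 5) exceeds the limit.
# The limit defaults to 90 (percent).
df_over_limit() {
    local limit="${1:-90}"
    awk -v limit="$limit" '
        NR > 1 {
            pct = $5
            sub(/%$/, "", pct)                 # strip the trailing percent sign
            # Skip non-numeric values such as "-" (no inode information)
            if (pct ~ /^[0-9]+$/ && pct + 0 > limit)
                print $6 " " pct "%"
        }'
}

# Usage:
#   df -P  | df_over_limit 90    # disk space
#   df -Pi | df_over_limit 90    # inodes
```

An empty result means that no file system is above the limit.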
To verify RAM usage:
free -h
Example of system response:
             total       used       free     shared    buffers     cached
Mem:          7.8G       7.3G       501M       416K       239M       2.6G
-/+ buffers/cache:       4.5G       3.3G
Swap:           0B         0B         0B
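In the output above, the used value on the "-/+ buffers/cache" line excludes memory held by buffers and the page cache, so it is the figure to compare against the 90% limit. A rough equivalent can be computed from /proc/meminfo, as sketched below. The function name mem_used_percent is hypothetical, and the result only approximates what free reports.

```shell
# Hypothetical helper: reads /proc/meminfo-style input on stdin and prints
# the percentage of RAM in use, excluding buffers and page cache.
mem_used_percent() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END {
            if (total > 0)
                printf "%.1f\n", (total - free - buffers - cached) * 100 / total
        }'
}

# Usage:
#   mem_used_percent < /proc/meminfo
```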
To verify CPU usage:
awk '/^cpu/ {total=$2+$3+$4+$5+$6+$7+$8+$9; print $1 "\tidle: " $5*100/total "%"}' /proc/stat
Example of system response:
cpu idle: 94.1113%
cpu0 idle: 94.3852%
cpu1 idle: 92.851%
cpu2 idle: 94.0428%
cpu3 idle: 94.1673%
cpu4 idle: 94.2658%
cpu5 idle: 94.3526%
cpu6 idle: 94.4082%
cpu7 idle: 94.4092%
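The counters in /proc/stat are cumulative since boot, so the percentages above are averages over the whole uptime. To estimate the current load, two samples can be compared. The following is a minimal sketch: the function name cpu_idle_between is hypothetical, and it treats only the idle field (field 5) as idle time, summing fields 2 through 9 (user through steal) as the total.

```shell
# Hypothetical helper: takes two aggregate "cpu" lines from /proc/stat
# (an earlier and a later sample) and prints the idle percentage between them.
cpu_idle_between() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user..steal
            idle[NR] = $5
            sum[NR]  = total
        }
        END {
            dt = sum[2] - sum[1]
            if (dt > 0) printf "%.1f\n", (idle[2] - idle[1]) * 100 / dt
        }'
}

# Usage:
#   a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat)
#   cpu_idle_between "$a" "$b"
```

An idle value below 10% over the sampled interval corresponds to CPU usage above the 90% limit.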
Verify the MTU and the status of the interfaces on all OpenContrail nodes:
ip link
Example of system response:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether ac:de:48:b0:2d:3e brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether ac:de:48:a8:7a:09 brd ff:ff:ff:ff:ff:ff
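When comparing many nodes, it can help to reduce the ip link output to the interface name, MTU, and state only. The following is a sketch: the function name link_summary is hypothetical, and it assumes the output layout shown in the example above.

```shell
# Hypothetical helper: reads `ip link` (or `ip -o link`) output on stdin and
# prints one "name mtu state" line per interface.
link_summary() {
    awk '/^[0-9]+: / {
        name = $2
        sub(/:$/, "", name)                    # drop the trailing colon
        mtu = "?"; state = "?"
        for (i = 3; i < NF; i++) {
            if ($i == "mtu")   mtu   = $(i + 1)
            if ($i == "state") state = $(i + 1)
        }
        print name, mtu, state
    }'
}

# Usage:
#   ip link | link_summary
```

For the example above, this prints lo 65536 UNKNOWN, eth0 1500 UP, and eth1 1500 UP on separate lines, which makes MTU mismatches between nodes easier to spot.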
Verify that the number of file handles currently allocated by the Linux kernel does not exceed the system limit:
cat /proc/sys/fs/file-nr
Example of system response:
17736 0 1609849
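The three values in /proc/sys/fs/file-nr are the number of allocated file handles, the number of allocated but unused file handles, and the system-wide maximum (fs.file-max). The following sketch expresses the first value as a percentage of the third. The function name file_handles_percent is hypothetical.

```shell
# Hypothetical helper: reads the contents of /proc/sys/fs/file-nr on stdin
# and prints allocated file handles as a percentage of the maximum.
file_handles_percent() {
    awk 'NF >= 3 && $3 > 0 { printf "%.2f\n", $1 * 100 / $3 }'
}

# Usage:
#   file_handles_percent < /proc/sys/fs/file-nr
```

For the example response above, this prints 1.10, which is far below the limit.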