Troubleshoot NodeLocalDNS

Running NodeLocalDNS can cause issues on certain Linux distributions, such as RHEL, CentOS, and Rocky Linux.

Pods stuck in CrashLoopBackOff

After you enable NodeLocalDNS in MKE, the NodeLocalDNS Pods may become stuck in the CrashLoopBackOff state:

kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns

NAME                   READY   STATUS             RESTARTS      AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
node-local-dns-cg49w   0/1     CrashLoopBackOff   5 (79s ago)   3m15s   172.31.32.61   centos7-centos-0   <none>           <none>
node-local-dns-ldjk4   0/1     CrashLoopBackOff   5 (7s ago)    3m15s   172.31.45.15   centos7-centos-1   <none>           <none>

kubectl logs -f -n kube-system -l k8s-app=node-local-dns

2024/05/14 17:34:05 [ERROR] Failed to add non-existent interface nodelocaldns: operation not supported
2024/05/14 17:34:05 [INFO] Added interface - nodelocaldns
2024/05/14 17:34:05 [ERROR] Error checking dummy device nodelocaldns - operation not supported
listen tcp 169.254.0.10:8080: bind: cannot assign requested address

This happens because the NodeLocalDNS DaemonSet creates a dummy network interface during network setup, and the dummy kernel module is not loaded by default in RHEL or CentOS. To fix the issue, load the dummy kernel module by running the following command on every node in the cluster:

sudo modprobe dummy
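
To confirm that the module is loaded, and to keep it loaded after a reboot, something like the following sketch may help. The helper name is_module_loaded is hypothetical, and the persistence step assumes a systemd-based distribution that reads /etc/modules-load.d at boot.

```shell
# Hypothetical helper: succeeds if the named module appears in the modules list.
# $1: module name; $2: optional modules list file (defaults to /proc/modules).
is_module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if is_module_loaded dummy; then
    echo "dummy module is loaded"
else
    echo "dummy module is not loaded; run: sudo modprobe dummy"
fi

# Assumption: systemd-modules-load is in use on the node.
# Uncomment to load the module automatically at boot:
# echo dummy | sudo tee /etc/modules-load.d/dummy.conf
```

The check only reads /proc/modules, so it does not require root. A module that is built into the kernel rather than loaded would not be listed there.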

NodeLocalDNS containers are unable to add iptables rules

Although the NodeLocalDNS Pods switch to running state after the dummy kernel module is loaded, the Pods still fail to add iptables rules.

The error appears in the NodeLocalDNS Pod logs:

kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns

NAME                   READY   STATUS    RESTARTS        AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
node-local-dns-khfh7   1/1     Running   0               4m11s   172.31.32.61   centos7-centos-0   <none>           <none>
node-local-dns-shtqn   1/1     Running   3 (3m48s ago)   4m11s   172.31.45.15   centos7-centos-1   <none>           <none>

kubectl logs -f -n kube-system -l k8s-app=node-local-dns

Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied
[ERROR] Error checking/adding iptables rule {raw OUTPUT [-p tcp -d 10.96.0.10 --dport 8080 -j NOTRACK -m comment --comment NodeLocal DNS Cache: skip conntrack]}, error - error checking rule: exit status 4: Ignoring deprecated --wait-interval option.
Warning: Extension CT revision 0 not supported, missing kernel module?
Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied
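
Before changing anything, it can be useful to check on the affected node whether SELinux is enforcing and whether it recently logged denials. The following is a minimal sketch. The helper name selinux_mode is hypothetical, and the commented ausearch command assumes that the audit tooling is installed on the node.

```shell
# Hypothetical helper: prints the current SELinux mode, or "unavailable"
# when the getenforce utility is not present on the node.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unavailable"
    fi
}

echo "SELinux mode: $(selinux_mode)"

# Assumption: auditd is running and ausearch is installed.
# Uncomment to list recent SELinux denials (AVC records):
# sudo ausearch -m avc -ts recent
```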

You can fix this problem in two different ways:

  • Use audit2allow to generate SELinux policy rules for the denied operations:

    module localdnsthird 1.0;
    
    require {
         type kernel_t;
         type spc_t;
         type rpm_script_t;
         type firewalld_t;
         type container_t;
         type iptables_var_run_t;
         class process transition;
         class capability { sys_admin sys_resource };
         class system module_request;
         class file { lock open read };
    }
    
    #============= container_t ==============
    allow container_t iptables_var_run_t:file lock;
    
    #!!!! This avc is allowed in the current policy
    allow container_t iptables_var_run_t:file { open read };
    
    #!!!! This avc is allowed in the current policy
    allow container_t kernel_t:system module_request;
    
    #============= firewalld_t ==============
    
    #!!!! This avc is allowed in the current policy
    allow firewalld_t self:capability { sys_admin sys_resource };
    
    #============= spc_t ==============
    
    #!!!! This avc is allowed in the current policy
    allow spc_t rpm_script_t:process transition;
    
  • Change the SELinux mode to permissive:

    sudo setenforce 0
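
For the first option, the policy module still has to be compiled and installed on each affected node. The following is a minimal sketch, assuming the policy shown above was saved as localdnsthird.te in the current directory. The file name and the function name install_selinux_module are hypothetical.

```shell
# Hypothetical helper: compiles <name>.te into a policy package and installs it.
# $1: module name, which must match the "module" line inside the .te file.
install_selinux_module() {
    name="$1"
    # Stop early if the policy source file is missing.
    [ -f "${name}.te" ] || { echo "missing ${name}.te" >&2; return 1; }
    # Compile the source into a binary module, package it, then install it.
    checkmodule -M -m -o "${name}.mod" "${name}.te" &&
    semodule_package -o "${name}.pp" -m "${name}.mod" &&
    sudo semodule -i "${name}.pp"
}

# Alternative (assumption: the denials are still in the audit log):
# sudo ausearch -m avc -ts recent | audit2allow -M localdnsthird
# sudo semodule -i localdnsthird.pp
```

To use the sketch, run install_selinux_module localdnsthird from the directory that contains the .te file. For the second option, note that setenforce 0 takes effect only until the next reboot.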