Kubernetes Networking Labs
Lesson 17

NodeLocal DNSCache

This lab demonstrates how NodeLocal DNSCache improves DNS performance in Kubernetes by running a DNS caching agent on each node. You'll deploy NodeLocal DNSCache manually and learn how the transparent proxy method works.

Overview

In standard Kubernetes DNS architecture, all DNS queries from pods are sent to the CoreDNS service (kube-dns). NodeLocal DNSCache addresses several challenges with this architecture:

Standard Kubernetes DNS Architecture

Latency Improvement

With the current DNS architecture, pods with the highest DNS QPS may have to reach out to a different node if there is no local kube-dns instance. Having a local cache helps improve latency in such scenarios by serving responses from the same node.

Conntrack Race Conditions

Skipping iptables DNAT and connection tracking helps reduce conntrack races and avoids UDP DNS entries filling up the conntrack table. This is a common cause of intermittent DNS failures in busy clusters.

TCP Connection Handling

Connections from the local caching agent to kube-dns are upgraded to TCP. TCP conntrack entries are removed on connection close, in contrast with UDP entries that have to timeout (default nf_conntrack_udp_timeout is 30 seconds). This reduces conntrack table pressure.
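
You can inspect the relevant conntrack parameters directly on a node. A quick check, assuming a Kind node named k01-worker as used later in this lab:

# UDP conntrack timeout (seconds)
docker exec k01-worker sysctl net.netfilter.nf_conntrack_udp_timeout

# Current and maximum conntrack table usage
docker exec k01-worker sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max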

Reduced Tail Latency

Upgrading DNS queries from UDP to TCP reduces tail latency attributed to dropped UDP packets and DNS timeouts. Without NodeLocal DNSCache, timeouts can be up to 30 seconds (3 retries + 10s timeout). Since the nodelocal cache listens for UDP DNS queries, applications don't need to be changed.

Node-Level Metrics

NodeLocal DNSCache provides metrics and visibility into DNS requests at a node level, making it easier to debug DNS issues and monitor performance per node.

Negative Caching

Negative caching (NXDOMAIN responses) can be enabled, reducing the number of queries to kube-dns for non-existent domains.

Summary

| Benefit | Description |
| --- | --- |
| Local Cache | DNS responses served from the same node |
| No DNAT | Bypasses kube-proxy's iptables DNAT rules |
| No Conntrack | NOTRACK rules prevent conntrack table issues |
| TCP Upstream | Uses TCP to CoreDNS (faster conntrack cleanup) |
| UDP to Pods | Pods still use UDP (no app changes needed) |
| Per-Node Metrics | Prometheus metrics per node |
| Negative Caching | NXDOMAIN responses cached locally |

Reference: For more details, see the official Kubernetes documentation: Using NodeLocal DNSCache in Kubernetes Clusters

πŸ’‘ Note: High DNS QPS usually isn't caused by "bad apps," but by concurrency, scale, retries, and startup synchronizationβ€”NodeLocal DNSCache exists because Kubernetes amplifies all of these at once.

How NodeLocal DNSCache Works (iptables Mode)

When kube-proxy runs in iptables mode, NodeLocal DNSCache uses the transparent proxy method:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Without NodeLocal DNSCache                                                  β”‚
β”‚                                                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”      10.96.0.10:53       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                       β”‚
β”‚  β”‚   Pod   β”‚ ────────────────────────►│   CoreDNS   β”‚  (remote pod)         β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   (network traversal)    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                       β”‚
β”‚                                                                              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  With NodeLocal DNSCache (iptables mode)                                     β”‚
β”‚                                                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”      10.96.0.10:53       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                  β”‚
β”‚  β”‚   Pod   β”‚ ────────────────────────►│ NodeLocal DNS    β”‚  (same node!)    β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   (routed locally!)      β”‚ Cache            β”‚                  β”‚
β”‚                                       β”‚                  β”‚                  β”‚
β”‚                 Pod's resolv.conf     β”‚ Binds locally:   β”‚  cache miss      β”‚
β”‚                 still shows:          β”‚ β€’ 10.96.0.10  ◄──┼─────────────►    β”‚
β”‚                 nameserver 10.96.0.10 β”‚ β€’ 169.254.20.10  β”‚    CoreDNS       β”‚
β”‚                                       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

How Transparent Interception Works

  1. Virtual Interface Creation: NodeLocal DNSCache creates a dummy network interface (nodelocaldns) on each node (see the sketch after this list)

  2. IP Binding: The agent binds the kube-dns ClusterIP (10.96.0.10) directly to this local interface

  3. Local Routing: When a pod sends a DNS query to 10.96.0.10, the Linux kernel finds this IP bound to a local interface and routes it locally

  4. Transparent to Pods: Pods don't need any configuration changes - they still use 10.96.0.10 as their nameserver

  5. Upstream Service: A separate kube-dns-upstream service is created for cache misses
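
To make steps 1-3 concrete, here is a minimal sketch of what the node-cache agent effectively does with the ip tool. This is for illustration only - the DaemonSet you deploy later manages this itself, so don't run it against a live node:

# Illustration only: replicate the dummy interface and IP binding by hand
ip link add nodelocaldns type dummy           # step 1: create the dummy interface
ip addr add 169.254.20.10/32 dev nodelocaldns # step 2: bind the link-local IP
ip addr add 10.96.0.10/32 dev nodelocaldns    # step 2: bind the kube-dns ClusterIP
ip route get 10.96.0.10                       # step 3: kernel now reports "local"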

NodeLocal DNSCache Architecture

Lab Setup

To set up the lab for this module, follow the steps in Lab setup.

The lab folder is /containerlab/17-nodelocal-dnscache

Manifest Files

ContainerLab

| File | Description |
| --- | --- |
| k01.clab.yaml | ContainerLab topology defining the Kind cluster |

Kind Cluster

| File | Description |
| --- | --- |
| k01-no-cni.yaml | Kind cluster configuration without CNI |

Calico CNI

| File | Description |
| --- | --- |
| calico-cni-config/custom-resources.yaml | Custom Calico Installation resource with IPAM configuration |

NodeLocal DNSCache

| File | Description |
| --- | --- |
| nodelocal-dnscache/nodelocaldns.yaml | NodeLocal DNSCache DaemonSet manifest |

Tools

| File | Description |
| --- | --- |
| tools/dns-test-pod.yaml | DNS test pod for cache validation |

Deployment

The deploy.sh script deploys a 3-node Kind cluster with Calico CNI. NodeLocal DNSCache is NOT installed - you will deploy it manually as part of this lab.

cd containerlab/17-nodelocal-dnscache
chmod +x deploy.sh
./deploy.sh

Lab Exercises

Note

The outputs shown in this section will differ in your lab. When running the commands, make sure you replace IP addresses, interface names, and node names to match your environment.

1. Verify the Lab Setup

# Set kubeconfig
export KUBECONFIG=$(pwd)/k01.kubeconfig

# Check nodes
kubectl get nodes -o wide

# Check CoreDNS pods
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide

Expected output:

NAME                STATUS   ROLES           AGE   VERSION
k01-control-plane   Ready    control-plane   10m   v1.32.2
k01-worker          Ready    <none>          10m   v1.32.2
k01-worker2         Ready    <none>          10m   v1.32.2

2. Verify Current DNS Configuration (Before NodeLocal DNSCache)

Check the kube-dns service:

kubectl get svc kube-dns -n kube-system

Expected output:

NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   10m

Check how pods resolve DNS:

kubectl exec dns-test -- cat /etc/resolv.conf

Expected output:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5

Currently, all DNS queries go to 10.96.0.10, which is handled by CoreDNS pods via kube-proxy.
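
As a baseline, you can also confirm that the nodelocaldns interface does not exist yet (k01-worker is an example node name; adjust for your lab):

# Should fail before NodeLocal DNSCache is deployed
docker exec k01-worker ip link show nodelocaldns

Expected output:

Device "nodelocaldns" does not exist.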

3. Copy the NodeLocal DNSCache Manifest

Now you'll deploy NodeLocal DNSCache following the official Kubernetes documentation.

The manifest (nodelocal-dnscache/nodelocaldns.yaml) is provided in the nodelocal-dnscache/ folder. Copy it to your working directory:

cp nodelocal-dnscache/nodelocaldns.yaml .

Note: The original manifest can also be downloaded from: https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml

4. Set the Configuration Variables

The manifest contains placeholder variables; some need manual replacement, while others are auto-populated:

| Variable | Replaced By | Description |
| --- | --- | --- |
| __PILLAR__LOCAL__DNS__ | sed (manual) | Local IP address for the cache (169.254.20.10) |
| __PILLAR__DNS__DOMAIN__ | sed (manual) | Cluster DNS domain (cluster.local) |
| __PILLAR__DNS__SERVER__ | sed (manual) | kube-dns ClusterIP (10.96.0.10) |
| __PILLAR__CLUSTER__DNS__ | Pod (auto) | Upstream DNS for cluster.local queries |
| __PILLAR__UPSTREAM__SERVERS__ | Pod (auto) | Upstream DNS for external queries |

# Get the kube-dns ClusterIP
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP})
echo "kube-dns ClusterIP: $kubedns"

# Set the cluster domain
domain="cluster.local"

# Set the local DNS IP (link-local address)
localdns="169.254.20.10"

5. Configure the Manifest for iptables Mode

Since Kind uses kube-proxy in iptables mode, we configure NodeLocal DNSCache to bind both the link-local IP AND the kube-dns ClusterIP:

sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml

According to the Kubernetes documentation:

In iptables mode, the node-local-dns pods listen on both the kube-dns service IP as well as <node-local-address>, so pods can look up DNS records using either IP address.

Auto-populated variables: The __PILLAR__CLUSTER__DNS__ and __PILLAR__UPSTREAM__SERVERS__ placeholders are automatically populated by the node-local-dns pods at startup. You do NOT need to replace these manually - they will remain as placeholders in the YAML file and the pod will substitute them when it starts.
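
A quick sanity check after running sed: only the two auto-populated placeholders should remain in the file:

grep -o "__PILLAR__[A-Z_]*__" nodelocaldns.yaml | sort -u

Expected output:

__PILLAR__CLUSTER__DNS__
__PILLAR__UPSTREAM__SERVERS__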

6. Understanding the Cache Configuration

Before deploying, let's examine the cache settings in the Corefile:

cluster.local:53 {
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    ...
}
.:53 {
    errors
    cache 30
    ...
}

6.1 Cache Settings Explained

| Zone | Setting | Value | Meaning |
| --- | --- | --- | --- |
| cluster.local | success 9984 30 | 9984 entries, 30s TTL | Cache up to 9984 successful responses for 30 seconds |
| cluster.local | denial 9984 5 | 9984 entries, 5s TTL | Cache up to 9984 NXDOMAIN responses for 5 seconds |
| . (external) | cache 30 | default entries, 30s TTL | Cache external DNS responses for 30 seconds |

6.2 Why Different TTLs?

| Response Type | TTL | Reason |
| --- | --- | --- |
| Success (30s) | Longer | Stable responses can be cached longer |
| Denial/NXDOMAIN (5s) | Shorter | Failed lookups might succeed soon (e.g., a newly created Service) |

6.3 Cache Size (9984 entries)

  • Each cache can hold up to 9984 entries
  • Default CoreDNS cache uses ~30MB when full
  • Separate limits for success and denial responses

6.4 Memory Impact

From the Kubernetes documentation:

The default cache size is 10000 entries, which uses about 30 MB when completely filled.

If you need to reduce memory usage, you can lower the cache size:

cache {
    success 1000 30    # Reduce to 1000 entries
    denial 1000 5
}
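
Once the DaemonSet is running (step 8 onwards), you can check the actual memory usage per pod. This assumes metrics-server is installed, which is not part of this lab's deploy script:

# Optional check - requires metrics-server
kubectl top pod -n kube-system -l k8s-app=node-local-dns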

7. Understanding DNS Forwarding (Cluster vs External)

NodeLocal DNSCache handles cluster and external DNS queries differently. You can see this in the pod logs:

[INFO] Using config file:
cluster.local:53 {
    ...
    forward . 10.96.248.93 {        # <-- kube-dns-upstream (CoreDNS)
            force_tcp
    }
}
.:53 {
    ...
    forward . /etc/resolv.conf      # <-- Node's DNS servers
}

7.1 Forwarding Rules

| Zone | Forward To | Protocol | Description |
| --- | --- | --- | --- |
| cluster.local | kube-dns-upstream (10.96.x.x) | TCP | Cluster services β†’ CoreDNS |
| in-addr.arpa | kube-dns-upstream | TCP | Reverse DNS for cluster IPs |
| ip6.arpa | kube-dns-upstream | TCP | Reverse DNS for IPv6 |
| . (external) | /etc/resolv.conf | UDP/TCP | External domains β†’ Node's DNS |

7.2 Why TCP for Cluster Queries?

Cluster queries use force_tcp because:

  • TCP conntrack entries are removed on connection close
  • UDP entries must timeout (default 30 seconds)
  • Reduces conntrack table pressure

7.3 External DNS Flow

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  External DNS Query (e.g., google.com)                                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                              β”‚
β”‚  Pod ──► NodeLocal DNSCache ──► /etc/resolv.conf ──► External DNS Server    β”‚
β”‚          (169.254.20.10)        (Node's DNS)         (e.g., 8.8.8.8)        β”‚
β”‚                                                                              β”‚
β”‚  The node's /etc/resolv.conf typically contains:                            β”‚
β”‚  - Cloud provider DNS (AWS: 169.254.169.253, GCP: 169.254.169.254)          β”‚
β”‚  - Or custom upstream DNS servers                                           β”‚
β”‚                                                                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

7.4 Cluster DNS Flow

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Cluster DNS Query (e.g., kubernetes.default.svc.cluster.local)             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                              β”‚
β”‚  Pod ──► NodeLocal DNSCache ──► kube-dns-upstream ──► CoreDNS Pods          β”‚
β”‚          (10.96.0.10)           (TCP, force_tcp)      (192.168.x.x)         β”‚
β”‚                                                                              β”‚
β”‚  Cache miss queries are upgraded to TCP for better conntrack handling        β”‚
β”‚                                                                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

NodeLocal DNSCache Setup

8. Deploy NodeLocal DNSCache

kubectl create -f nodelocaldns.yaml

Expected output:

serviceaccount/node-local-dns created
service/kube-dns-upstream created
configmap/node-local-dns created
daemonset.apps/node-local-dns created

9. Wait for NodeLocal DNSCache Pods to be Ready

kubectl get pods -n kube-system -l k8s-app=node-local-dns -o wide -w

Wait until all pods show Running status (press Ctrl+C to exit watch):

Expected output:

NAME                   READY   STATUS    RESTARTS   AGE   IP            NODE
node-local-dns-xxxxx   1/1     Running   0          30s   192.168.1.2   k01-control-plane
node-local-dns-xxxxx   1/1     Running   0          30s   192.168.1.3   k01-worker
node-local-dns-xxxxx   1/1     Running   0          30s   192.168.1.4   k01-worker2

A NodeLocal DNSCache pod runs on every node in the cluster.
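
You can also confirm the DaemonSet is fully rolled out; DESIRED and READY should both equal your node count (3 in this lab):

kubectl get daemonset node-local-dns -n kube-system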

10. Verify the kube-dns-upstream Service

NodeLocal DNSCache creates a new service for reaching CoreDNS on cache misses:

kubectl get svc -n kube-system | grep dns

Expected output:

kube-dns            ClusterIP   10.96.0.10    <none>        53/UDP,53/TCP,9153/TCP   15m
kube-dns-upstream   ClusterIP   10.96.x.x     <none>        53/UDP,53/TCP            2m

| Service | Purpose |
| --- | --- |
| kube-dns | Original DNS service (ClusterIP now bound locally on each node) |
| kube-dns-upstream | Used by NodeLocal DNSCache to reach CoreDNS for cache misses |

11. Verify the Local Interface Binding

This is the key to transparent interception - check that the kube-dns ClusterIP is bound to a local interface.

First, identify which node the dns-test pod is running on (we'll use this node for all verification):

# Find which node the dns-test pod is on
NODE=$(kubectl get pod dns-test -o jsonpath='{.spec.nodeName}')
echo "dns-test pod is running on node: $NODE"

# Check the nodelocaldns interface on that node
docker exec $NODE ip addr show nodelocaldns

Expected output:

X: nodelocaldns: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 169.254.20.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever
    inet 10.96.0.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever

This is the magic! The nodelocaldns interface has both IPs bound:

  • 169.254.20.10/32 - Link-local IP
  • 10.96.0.10/32 - The kube-dns ClusterIP!

When a pod queries 10.96.0.10, the kernel finds this IP on a local interface and routes it locally to NodeLocal DNSCache.
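
You can also see the corresponding entries in the kernel's local routing table, which is what drives this routing decision (output format is approximate):

docker exec $NODE ip route show table local | grep -E "10.96.0.10|169.254.20.10"

Expected output:

local 10.96.0.10 dev nodelocaldns proto kernel scope host src 10.96.0.10
local 169.254.20.10 dev nodelocaldns proto kernel scope host src 169.254.20.10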

12. Verify Pod DNS Still Works (Transparently!)

kubectl exec dns-test -- cat /etc/resolv.conf

Expected output:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5

The pod configuration is unchanged - it still uses 10.96.0.10. The difference is that queries are now handled locally!

kubectl exec dns-test -- nslookup kubernetes.default

Expected output:

Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

13. The Order of Operations

This section explains exactly how DNS queries are intercepted by NodeLocal DNSCache. Understanding this flow is critical to understanding how NodeLocal DNSCache works.

When a pod sends a DNS query to the kube-dns ClusterIP, here's what happens:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    DNS Query - Order of Operations                           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                              β”‚
β”‚  Step 1: PACKET INGRESS                                                     β”‚
β”‚  ─────────────────────                                                      β”‚
β”‚  Pod sends DNS query to kube-dns ClusterIP (10.96.0.10:53)                  β”‚
β”‚  Packet enters the node's network namespace via veth pair                   β”‚
β”‚                         β”‚                                                    β”‚
β”‚                         β–Ό                                                    β”‚
β”‚  Step 2: RAW TABLE (FIRST!)                                                 β”‚
β”‚  ─────────────────────────                                                  β”‚
β”‚  The packet hits the raw table PREROUTING chain first                       β”‚
β”‚  NodeLocal DNSCache's NOTRACK rule matches β†’ packet marked as UNTRACKED     β”‚
β”‚                         β”‚                                                    β”‚
β”‚                         β–Ό                                                    β”‚
β”‚  Step 3: NAT TABLE (SKIPPED!)                                               β”‚
β”‚  ───────────────────────────                                                β”‚
β”‚  Packet reaches nat PREROUTING where kube-proxy's DNAT rules exist          β”‚
β”‚  BUT: DNAT requires conntrack to track translations                         β”‚
β”‚  Since packet is UNTRACKED, DNAT cannot operate β†’ destination unchanged     β”‚
β”‚                         β”‚                                                    β”‚
β”‚                         β–Ό                                                    β”‚
β”‚  Step 4: ROUTING DECISION                                                   β”‚
β”‚  ────────────────────────                                                   β”‚
β”‚  Kernel checks: "Where is 10.96.0.10?"                                      β”‚
β”‚  NodeLocal DNSCache bound 10.96.0.10 to local nodelocaldns interface        β”‚
β”‚  Kernel determines: "This is a LOCAL destination"                           β”‚
β”‚                         β”‚                                                    β”‚
β”‚                         β–Ό                                                    β”‚
β”‚  Step 5: LOCAL DELIVERY                                                     β”‚
β”‚  ──────────────────────                                                     β”‚
β”‚  Packet goes through filter INPUT chain (ACCEPT rules match)                β”‚
β”‚  Packet delivered to NodeLocal DNSCache listening on 10.96.0.10:53          β”‚
β”‚                         β”‚                                                    β”‚
β”‚            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                                      β”‚
β”‚            β–Ό                         β–Ό                                       β”‚
β”‚     Cache HIT               Cache MISS                                      β”‚
β”‚     Return response         Forward to kube-dns-upstream                    β”‚
β”‚     immediately             β†’ CoreDNS pods                                  β”‚
β”‚                                                                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The diagram below is a good way to visualize how packets traverse the different iptables chains in the Linux kernel:

iptables Chains

Image source: Understanding the policy enforcement options with Calico

14. Validate Step 2: raw Table NOTRACK Rules

The raw table is processed first before any other table. NodeLocal DNSCache adds NOTRACK rules here:

docker exec $NODE iptables -t raw -S | grep -E "169.254.20.10|10.96.0.10"

Expected output:

-A PREROUTING -d 169.254.20.10/32 -p udp -m udp --dport 53 -j NOTRACK
-A PREROUTING -d 169.254.20.10/32 -p tcp -m tcp --dport 53 -j NOTRACK
-A PREROUTING -d 10.96.0.10/32 -p udp -m udp --dport 53 -j NOTRACK
-A PREROUTING -d 10.96.0.10/32 -p tcp -m tcp --dport 53 -j NOTRACK
-A OUTPUT -d 169.254.20.10/32 -p udp -m udp --dport 53 -j NOTRACK
-A OUTPUT -d 169.254.20.10/32 -p tcp -m tcp --dport 53 -j NOTRACK
-A OUTPUT -d 10.96.0.10/32 -p udp -m udp --dport 53 -j NOTRACK
-A OUTPUT -d 10.96.0.10/32 -p tcp -m tcp --dport 53 -j NOTRACK

14.1 Why NOTRACK is Critical

EffectExplanation
Marks packet as UNTRACKEDConntrack subsystem ignores this packet
DNAT cannot operateNAT targets require conntrack to track translations
Destination stays unchangedPacket keeps original destination (10.96.0.10)
Prevents conntrack racesEliminates UDP DNS failures from conntrack table races

15. Validate Step 3: Kube-proxy DNAT Rules Still Exist

Kube-proxy's DNAT rules for kube-dns are still present, but they won't affect UNTRACKED packets:

docker exec $NODE iptables -t nat -S | grep "kube-dns:dns cluster IP" | head -2

Expected output:

-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4

These DNAT rules would normally redirect traffic from 10.96.0.10 to CoreDNS pod IPs. However, because the DNS packets are marked UNTRACKED in the raw table, the DNAT target cannot function (it requires conntrack to track the address translation). The packet passes through with its destination unchanged.
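
You can confirm this by looking for conntrack entries toward the kube-dns IP; if the NOTRACK rules are working, there should be none. This assumes the conntrack tool is available in the Kind node image:

# Generate a DNS query, then list UDP conntrack entries to port 53
kubectl exec dns-test -- nslookup kubernetes.default
docker exec $NODE conntrack -L -p udp --dport 53 2>/dev/null | grep 10.96.0.10

Expected output: nothing - the DNS packets are untracked.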

16. Validate Step 4: Local Route for kube-dns IP

Verify that the kube-dns ClusterIP is bound to the local nodelocaldns interface:

docker exec $NODE ip addr show nodelocaldns

Expected output:

X: nodelocaldns: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 169.254.20.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever
    inet 10.96.0.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever

Verify the kernel considers this IP local:

docker exec $NODE ip route get 10.96.0.10

Expected output:

local 10.96.0.10 dev lo src 10.96.0.10 uid 0
    cache <local>

The local keyword confirms the kernel routes packets destined for 10.96.0.10 to the local host.

17. Validate Step 5: Filter INPUT Rules

The filter table INPUT chain allows DNS traffic to the local cache:

docker exec $NODE iptables -S INPUT | grep -E "169.254.20.10|10.96.0.10"

Expected output:

-A INPUT -d 169.254.20.10/32 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -d 169.254.20.10/32 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -d 10.96.0.10/32 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -d 10.96.0.10/32 -p tcp -m tcp --dport 53 -j ACCEPT

These ACCEPT rules ensure DNS traffic reaches the NodeLocal DNSCache process.
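
Adding -v to iptables shows packet counters, so you can watch these ACCEPT rules match live traffic (the counter values will differ in your lab):

# Run a query, then inspect the rule counters
kubectl exec dns-test -- nslookup kubernetes.default
docker exec $NODE iptables -L INPUT -v -n | grep -E "169.254.20.10|10.96.0.10"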

18. Test DNS Resolution

kubectl exec dns-test -- nslookup kubernetes.default

Expected output:

Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

The pod queries 10.96.0.10, which is now handled locally by NodeLocal DNSCache!

19. Verify Cache Metrics

Check the cache statistics to confirm NodeLocal DNSCache is working.

First, identify which node the dns-test pod is running on, then find the node-local-dns pod on the same node:

# Find which node the dns-test pod is on
TEST_NODE=$(kubectl get pod dns-test -o jsonpath='{.spec.nodeName}')
echo "dns-test pod is running on: $TEST_NODE"

# Find the node-local-dns pod on the SAME node
NODELOCAL_POD=$(kubectl get pods -n kube-system -l k8s-app=node-local-dns --field-selector spec.nodeName=$TEST_NODE -o jsonpath='{.items[0].metadata.name}')
echo "node-local-dns pod on same node: $NODELOCAL_POD"

# Set up port-forwarding to that specific pod
kubectl port-forward -n kube-system $NODELOCAL_POD 9253:9253 &
sleep 2

View All Cache Metrics

curl -s http://127.0.0.1:9253/metrics | grep -E "coredns_cache_(hits|misses|requests).*10.96.0.10.*cluster.local"

Expected output:

coredns_cache_hits_total{server="dns://10.96.0.10:53",type="denial",view="",zones="cluster.local."} 12
coredns_cache_hits_total{server="dns://10.96.0.10:53",type="success",view="",zones="cluster.local."} 9
coredns_cache_misses_total{server="dns://10.96.0.10:53",view="",zones="cluster.local."} 35
coredns_cache_requests_total{server="dns://10.96.0.10:53",view="",zones="cluster.local."} 56

Seeing requests to 10.96.0.10:53 confirms the transparent interception is working!

20. Test Cache Hits in Action

To see cache hits increasing, make repeated DNS queries:

# Check current cache hits
curl -s http://127.0.0.1:9253/metrics | grep "coredns_cache_hits_total.*success.*cluster.local"

# Make a DNS query
kubectl exec dns-test -- nslookup kubernetes.default

# Check cache hits again - should increase if it was a cache hit!
curl -s http://127.0.0.1:9253/metrics | grep "coredns_cache_hits_total.*success.*cluster.local"

The more you query the same domain, the more cache hits you'll see!
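
To drive the counter up quickly, loop the same query a few times (a small helper loop, not part of the lab scripts; responses stay cached for 30 seconds per the success TTL):

# Repeat the same lookup 10 times, then re-check the hit counter
for i in $(seq 1 10); do
  kubectl exec dns-test -- nslookup kubernetes.default > /dev/null
done
curl -s http://127.0.0.1:9253/metrics | grep "coredns_cache_hits_total.*success.*cluster.local"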

21. View Current Cache Entries

See how many DNS responses are currently cached:

curl -s http://127.0.0.1:9253/metrics | grep "coredns_cache_entries.*10.96.0.10"

Expected output:

coredns_cache_entries{server="dns://10.96.0.10:53",type="denial",view="",zones="cluster.local."} 3
coredns_cache_entries{server="dns://10.96.0.10:53",type="success",view="",zones="cluster.local."} 1

Stop port-forward when done:

kill %1

Note: CoreDNS does not expose individual cached domain names - only aggregate statistics.

22. Enable Logging in the ConfigMap

You can enable logging in NodeLocal DNSCache to see individual DNS queries. This is useful for debugging but should be disabled in production due to log volume.

Edit the node-local-dns ConfigMap to add the log directive:

kubectl edit configmap node-local-dns -n kube-system

Add log after each zone declaration in the Corefile:

apiVersion: v1
data:
  Corefile: |
    cluster.local:53 {
        log                    # <-- Add this line
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.20.10 10.96.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
        health 169.254.20.10:8080
    }
    in-addr.arpa:53 {
        log                    # <-- Add this line
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.96.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        log                    # <-- Add this line
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.96.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        log                    # <-- Add this line
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.96.0.10
        forward . __PILLAR__UPSTREAM__SERVERS__ {
            force_tcp
        }
        prometheus :9253
    }

Save and exit the editor (:wq in vim).
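
If you prefer a non-interactive approach, a GNU sed one-liner over the exported ConfigMap can insert the log directive before each errors line. This is a convenience sketch - review the output before applying it:

# Insert "log" before every "errors" line in the Corefile, then re-apply
kubectl get configmap node-local-dns -n kube-system -o yaml \
  | sed 's/^\(\s*\)errors$/\1log\n\1errors/' \
  | kubectl apply -f -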

23. Restart NodeLocal DNS Pods

The pods need to be restarted to pick up the ConfigMap changes:

kubectl rollout restart daemonset node-local-dns -n kube-system
kubectl rollout status daemonset node-local-dns -n kube-system

24. View DNS Query Logs

Now make some DNS queries and watch the logs:

# Get the node-local-dns pod on the same node as dns-test
TEST_NODE=$(kubectl get pod dns-test -o jsonpath='{.spec.nodeName}')
NODELOCAL_POD=$(kubectl get pods -n kube-system -l k8s-app=node-local-dns --field-selector spec.nodeName=$TEST_NODE -o jsonpath='{.items[0].metadata.name}')

# Watch the logs
kubectl logs -n kube-system $NODELOCAL_POD -f &

# Make a DNS query
kubectl exec dns-test -- nslookup kubernetes.default
kubectl exec dns-test -- nslookup google.com

Expected output:

[INFO] 192.168.82.65:54312 - 42953 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000232824s
[INFO] 192.168.82.65:38291 - 18276 "A IN google.com. udp 28 false 512" NOERROR qr,rd,ra 54 0.023451s

24.1 Understanding the Log Format

| Field | Example | Meaning |
| --- | --- | --- |
| Client IP | 192.168.82.65:54312 | Pod IP and source port |
| Query ID | 42953 | DNS query identifier |
| Query | A IN kubernetes.default... | Record type and domain |
| Protocol | udp 54 | UDP, 54 bytes |
| Response | NOERROR | Successful resolution |
| Duration | 0.000232824s | Time to resolve |

Stop watching logs:

kill %1

25. Disable Logging (Recommended for Production)

To disable logging, edit the ConfigMap and remove the log lines:

kubectl edit configmap node-local-dns -n kube-system

Remove all log lines from the Corefile, then restart the pods:

kubectl rollout restart daemonset node-local-dns -n kube-system

Warning: Keep logging disabled in production - it generates significant log volume and can impact performance.

Summary

Order of Operations Recap

| Step | What Happens | Validation Command |
| --- | --- | --- |
| 1. Packet Ingress | Pod sends DNS to 10.96.0.10:53 | kubectl exec dns-test -- nslookup kubernetes |
| 2. raw Table | NOTRACK marks packet as untracked | iptables -t raw -S \| grep 10.96.0.10 |
| 3. nat Table | DNAT skipped (requires conntrack) | iptables -t nat -S \| grep kube-dns |
| 4. Routing | Kernel sees 10.96.0.10 is local | ip route get 10.96.0.10 |
| 5. Delivery | Packet delivered to local cache | ip addr show nodelocaldns |

What You Learned

In this lab, you:

  1. Deployed NodeLocal DNSCache manually following the official Kubernetes documentation
  2. Verified the transparent proxy - pods use 10.96.0.10 with no configuration changes
  3. Understood the packet flow - how NOTRACK + local IP binding intercepts DNS
  4. Explored iptables rules - raw table NOTRACK and filter table ACCEPT rules

Key Takeaways

| Concept | Explanation |
| --- | --- |
| raw table NOTRACK | Marks DNS packets as untracked, preventing DNAT from operating |
| Local IP binding | kube-dns ClusterIP bound to the nodelocaldns interface |
| Routing decision | Kernel routes to the local interface instead of the network |
| Transparent to pods | No pod configuration changes needed |
| kube-dns-upstream | Service for cache misses to reach CoreDNS |

Why NOTRACK is Essential

Without NOTRACK, kube-proxy's DNAT rules would redirect DNS queries to CoreDNS pods. The NOTRACK target in the raw table:

  1. Runs first - raw table is processed before nat table
  2. Disables conntrack - packet marked as untracked
  3. Breaks DNAT - NAT targets require conntrack to function
  4. Preserves destination - packet keeps original destination (10.96.0.10)

This allows the routing decision to see 10.96.0.10 (bound locally) rather than a CoreDNS pod IP.

iptables Mode vs IPVS Mode

| Aspect | iptables Mode (this lab) | IPVS Mode |
| --- | --- | --- |
| Local binding | Both 169.254.20.10 AND 10.96.0.10 | Only 169.254.20.10 |
| Pod configuration | No changes needed | Must use --cluster-dns=169.254.20.10 |
| Why | Can bind kube-dns IP locally | IPVS already uses the kube-dns IP for load balancing |
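
For contrast, in IPVS mode you would point kubelet at the link-local address yourself, for example via a KubeletConfiguration snippet like the one below (not needed in this lab, which uses iptables mode):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
  - 169.254.20.10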

Lab Cleanup

To clean up the lab, follow the steps in Lab cleanup.

Or run:

chmod +x destroy.sh
./destroy.sh
