DNS best practices


DNS is one of the most critical foundational services in a Kubernetes cluster. Improper client-side configuration or a large cluster scale can cause DNS resolution to time out or fail. This topic describes best practices for DNS in Kubernetes clusters to help you avoid these issues.

Important notes

This topic does not apply to managed CoreDNS or ACK clusters with Auto Mode enabled. Managed CoreDNS automatically scales based on workload, so manual adjustments are unnecessary.

Contents

DNS best practices cover both the client side and the server side:

For more information about CoreDNS, see the official CoreDNS documentation.

Optimize domain name resolution requests

DNS domain name resolution is one of the most frequent network activities in Kubernetes. Many of these requests can be optimized or avoided. You can optimize domain name resolution requests in the following ways:

  • (Recommended) Use connection pools. If a containerized application frequently accesses another service, use a connection pool. Connection pools cache upstream service connections in memory, which avoids the overhead of DNS resolution and TCP connection setup for each access.

  • Use asynchronous or long polling modes to obtain the IP address that corresponds to a DNS domain name.

  • Use DNS caching:

    • (Recommended) If your application cannot be modified to use connection pools, consider caching DNS resolution results on the application side. For more information, see Use NodeLocal DNSCache.

    • If NodeLocal DNSCache is unavailable, you can cache DNS queries inside the container using Name Service Cache Daemon (NSCD). For more information, see Use NSCD in Kubernetes clusters.

  • Optimize the resolv.conf file: The ndots and search parameters in the resolv.conf file affect the efficiency of domain name resolution based on how you write domain names in container configurations. For more information about how the ndots and search parameters work, see DNS policy configuration and domain name resolution.

  • Optimize the domain name configuration. When a containerized application accesses a domain name, configure the domain name as follows to minimize resolution attempts and reduce resolution time:

    • For pods that access a Service in the same namespace, use <service-name>, where service-name is the name of the Service.

    • For pods that access a Service across namespaces, use <service-name>.<namespace-name>, where namespace-name is the namespace of the Service.

    • When you access external domains, use fully qualified domain names (FQDNs). Append a trailing dot (.) to common domain names to mark them as absolute addresses. This avoids the repeated invalid lookups caused by search domain concatenation. For example, to access www.aliyun.com, use the FQDN `www.aliyun.com.` (with the trailing dot).

      • In clusters of version 1.33 or later, you can configure the search domain as a single "." (see related issue: 125883) to achieve a similar effect:

        dnsPolicy: None
        dnsConfig:
          nameservers: ["192.168.0.10"]  ## Replace 192.168.0.10 with the actual CoreDNS service clusterIP
          searches:
          - .
          - default.svc.cluster.local  ## Replace "default" with your namespace name
          - svc.cluster.local
          - cluster.local

        After you apply this configuration, the /etc/resolv.conf file in the pod appears as follows:

        search . default.svc.cluster.local svc.cluster.local cluster.local
        nameserver 192.168.0.10

        Because "." is the first search domain, the resolver first tries each domain name as an FQDN, so external domain names resolve directly without unnecessary search attempts.

        Important

        This configuration takes effect only if dnsPolicy is set to None.

        Complete workload example

        apiVersion: apps/v1
        kind: Deployment
        metadata:
          labels:
            app: nginx
          name: nginx
          namespace: default
        spec:
          progressDeadlineSeconds: 600
          replicas: 3
          revisionHistoryLimit: 10
          selector:
            matchLabels:
              app: nginx
          strategy:
            rollingUpdate:
              maxSurge: 25%
              maxUnavailable: 25%
            type: RollingUpdate
          template:
            metadata:
              labels:
                app: nginx
            spec:
              containers:
              - image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
                imagePullPolicy: Always
                name: nginx
                resources: {}
                terminationMessagePath: /dev/termination-log
                terminationMessagePolicy: File
              dnsPolicy: None
              dnsConfig:
                nameservers: ["192.168.0.10"]  ## Replace 192.168.0.10 with the actual CoreDNS service clusterIP
                searches:
                - .
                - default.svc.cluster.local
                - svc.cluster.local
                - cluster.local
              hostname: nginx
              restartPolicy: Always
              schedulerName: default-scheduler
              securityContext: {}
              subdomain: subdomain
              terminationGracePeriodSeconds: 30
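
To see why a trailing dot or a leading "." search domain saves lookups, the following sketch models the order in which a glibc-style resolver generates candidate names. It is a simplified illustration, not the resolver's actual code. The function name `candidate_queries` is hypothetical, `ndots=5` is assumed because it is the usual default for Kubernetes pods, and a real resolver typically sends both A and AAAA queries for each candidate.

```python
def candidate_queries(name, search, ndots=5):
    """Return the names a resolver would try, in order (simplified model)."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list expansion.
        return [name]

    as_is = name + "."
    # A "." search domain means "try the name as is".
    expanded = [as_is if d == "." else f"{name}.{d}." for d in search]

    if name.count(".") >= ndots:
        ordered = [as_is] + expanded   # enough dots: try the name itself first
    else:
        ordered = expanded + [as_is]   # too few dots: walk the search list first

    # Drop repeats while keeping order (the name is not retried as is
    # when "." was already on the search list).
    unique = []
    for query in ordered:
        if query not in unique:
            unique.append(query)
    return unique


K8S_SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

cases = [
    ("default search list", "www.aliyun.com", K8S_SEARCH),
    ("trailing dot", "www.aliyun.com.", K8S_SEARCH),
    ('"." as first search domain', "www.aliyun.com", ["."] + K8S_SEARCH),
]
for label, name, search in cases:
    attempt = candidate_queries(name, search).index("www.aliyun.com.") + 1
    print(f"{label}: absolute name is tried on attempt {attempt}")
```

In this model, the plain name `www.aliyun.com` is tried as an absolute name only on the fourth attempt, after three lookups against cluster search domains. With a trailing dot, or with "." as the first search domain, it is tried on the first attempt.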

Understand DNS configuration in containers

  • Different DNS resolvers may behave slightly differently due to implementation differences. You might observe cases where dig <domain> resolves successfully but ping <domain> fails.

  • Avoid using Alpine base images. The musl libc library in Alpine container images differs from the standard glibc and can cause issues such as the following:

    • Alpine 3.18 and earlier versions do not support falling back to TCP when a DNS response is truncated (the TC flag is set).

    • Alpine versions 3.3 and earlier do not support the search parameter or search domains, which prevents service discovery.

    • Concurrent queries to multiple DNS servers that are configured in /etc/resolv.conf can invalidate NodeLocal DNSCache optimizations.

    • Concurrent A and AAAA record queries that use the same socket can trigger conntrack source port conflicts on older kernels, which causes packet loss.

    For more information, see musl libc.

  • If you use Go applications, make sure that you understand the differences between CGO and Pure Go DNS resolver implementations.

Avoid probabilistic DNS resolution timeouts caused by IPVS defects

If you use IPVS as the kube-proxy load balancing mode, you may encounter intermittent DNS resolution timeouts during a CoreDNS scale-in or restart. This issue is caused by a defect in the Linux kernel. For more information, see IPVS.

You can reduce the impact of IPVS defects using one of the following methods:

Use NodeLocal DNSCache

In some scenarios, CoreDNS may encounter the following issues:

  • Rarely, concurrent A and AAAA queries may cause packet loss, which leads to DNS resolution failures.

  • Full conntrack tables on nodes may cause packet loss, which results in DNS resolution failures.

To improve the stability and performance of the DNS service in your cluster, you can install the NodeLocal DNSCache component. This component runs a DNS cache on each node in the cluster to improve DNS performance. For more information about NodeLocal DNSCache and how to deploy it in ACK clusters, see Use the NodeLocal DNSCache component.

Important

After you install NodeLocal DNSCache, you must inject the DNS cache configuration into your pods. You can run the following command to add a label to a namespace. New pods that are created in this namespace automatically receive the DNS cache configuration. For information about other injection methods, see the document linked in the previous paragraph.

kubectl label namespace default node-local-dns-injection=enabled

Use an appropriate CoreDNS version

CoreDNS maintains good backward compatibility with different Kubernetes versions. We recommend that you keep CoreDNS at a recent, stable version. The Component Management page in the ACK console provides features to install, upgrade, and configure CoreDNS. You can monitor the status of your components on the Component Management page. If an upgrade is available for CoreDNS, perform the upgrade during off-peak hours.

CoreDNS versions earlier than v1.7.0 have known risks, including but not limited to the following:

The recommended minimum CoreDNS version varies based on the Kubernetes cluster version, as shown in the following table:

| Cluster version | Recommended minimum CoreDNS version |
| --- | --- |
| Below 1.14.8 | v1.6.2 (no longer maintained) |
| 1.14.8 and later, below 1.20.4 | v1.7.0.0-f59c03d-aliyun |
| 1.20.4 and later, below 1.21.0 | v1.8.4.1-3a376cc-aliyun |
| 1.21.0 and later | v1.11.3.2-f57ea7ed6-aliyun |
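
The version mapping above can also be expressed as a small lookup, which may be handy in an inspection script. This is a minimal sketch: the function names are hypothetical, and the version parsing assumes cluster versions of the form `1.20.11` or `1.20.11-aliyun.1`.

```python
def parse_cluster_version(version):
    """Turn '1.20.11-aliyun.1' or 'v1.20.11' into the tuple (1, 20, 11)."""
    core = version.lstrip("v").split("-")[0]
    parts = [int(p) for p in core.split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.21' to (1, 21, 0)
    return tuple(parts[:3])


def recommended_min_coredns(cluster_version):
    """Return the recommended minimum CoreDNS version from the table above."""
    v = parse_cluster_version(cluster_version)
    if v < (1, 14, 8):
        return "v1.6.2"  # no longer maintained
    if v < (1, 20, 4):
        return "v1.7.0.0-f59c03d-aliyun"
    if v < (1, 21, 0):
        return "v1.8.4.1-3a376cc-aliyun"
    return "v1.11.3.2-f57ea7ed6-aliyun"


for cluster in ["1.14.6", "1.18.8", "1.20.11-aliyun.1", "1.32.1"]:
    print(cluster, "->", recommended_min_coredns(cluster))
```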

Monitor CoreDNS runtime status

Monitoring metrics

CoreDNS exposes health metrics, such as resolution results, through a standard Prometheus interface. This helps you detect anomalies in CoreDNS or upstream DNS servers.

Prometheus for ACK includes built-in CoreDNS monitoring metrics and alerting rules. You can enable the Prometheus and Dashboard features in the Container Service for Kubernetes console. For more information, see CoreDNS component monitoring.

If you run a self-managed Prometheus instance for your Kubernetes cluster, you can observe the relevant metrics and set alerts for critical ones. For more information, see the official CoreDNS Prometheus documentation.

Operational logs

When DNS anomalies occur, you can use CoreDNS logs to quickly diagnose the root causes. You can enable CoreDNS domain resolution logs and SLS log collection. For more information, see Analyze and monitor CoreDNS logs.

Kubernetes event delivery

In CoreDNS v1.9.3.6-32932850-aliyun and later, you can enable the k8s_event plugin to deliver critical CoreDNS logs as Kubernetes events to the Event Hub. For more information about the k8s_event plugin, see k8s_event.

Newly deployed CoreDNS instances have this feature enabled by default. If you are upgrading from an older version of CoreDNS, you must manually modify the configuration file to enable the feature.

  1. Run the following command to open the CoreDNS configuration file.

    kubectl -n kube-system edit configmap/coredns
  2. Add the kubeapi and k8s_event plugins.

    apiVersion: v1
    data:
      Corefile: |
        .:53 {
            errors
            health {
                lameduck 15s
            }
            # Begin addition (ignore other differences).
            kubeapi
            k8s_event {
              level info error warning # Deliver critical logs with info, error, or warning levels.
            }
            # End addition.
            kubernetes cluster.local in-addr.arpa ip6.arpa {
                pods verified
                fallthrough in-addr.arpa ip6.arpa
            }
            # Omitted below.
        }
  3. Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.

Ensure CoreDNS high availability

CoreDNS serves as the authoritative DNS for the cluster. A CoreDNS failure can cause internal Service access to fail, which can disrupt large portions of your business. You can ensure CoreDNS high availability using the following measures:

Assess CoreDNS component pressure

You can run DNS stress tests in your cluster to evaluate component pressure. Many open-source tools, such as DNSPerf, can help with this. If you cannot accurately assess the DNS pressure, follow these recommendations:

  • Always deploy at least two CoreDNS pods, with each pod having resource limits of at least 1 CPU core and 1 GB of memory.

  • The QPS capacity of CoreDNS is directly correlated with CPU usage. With NodeLocal DNSCache, each CPU core can support more than 10,000 QPS. The DNS QPS requirements of business workloads vary significantly. You should monitor the peak CPU usage of each CoreDNS pod. If the CPU usage exceeds one core during peak business periods, scale out the CoreDNS replicas. If the peak CPU usage is unknown, as a conservative measure, deploy one CoreDNS pod for every eight nodes in the cluster.

Adjust the number of CoreDNS pods

The number of CoreDNS pods directly determines the available compute resources. You can adjust this number based on your assessment.

Important

UDP has no retransmission mechanism. If the cluster nodes are affected by the IPVS UDP defects, a CoreDNS pod scale-in or restart may cause cluster-wide DNS resolution timeouts or anomalies that last for up to five minutes. For solutions to IPVS-related resolution issues, see Troubleshoot DNS resolution anomalies.

  • Automatically adjust the number of pods based on recommended policies

    You can deploy the following dns-autoscaler. It automatically adjusts the number of CoreDNS pods in real time based on the recommended ratio of one pod for every eight cluster nodes. The formula for the number of replicas is `replicas = max(ceil(cores × 1/coresPerReplica), ceil(nodes × 1/nodesPerReplica))`, which is constrained by the max and min limits.

    dns-autoscaler

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dns-autoscaler
      namespace: kube-system
      labels:
        k8s-app: dns-autoscaler
    spec:
      selector:
        matchLabels:
          k8s-app: dns-autoscaler
      template:
        metadata:
          labels:
            k8s-app: dns-autoscaler
        spec:
          serviceAccountName: admin
          containers:
          - name: autoscaler
            image: registry.cn-hangzhou.aliyuncs.com/acs/cluster-proportional-autoscaler:1.8.4
            resources:
              requests:
                cpu: "200m"
                memory: "150Mi"
            command:
            - /cluster-proportional-autoscaler
            - --namespace=kube-system
            - --configmap=dns-autoscaler
            - --nodelabels=type!=virtual-kubelet
            - --target=Deployment/coredns
            - --default-params={"linear":{"coresPerReplica":64,"nodesPerReplica":8,"min":2,"max":100,"preventSinglePointFailure":true}}
            - --logtostderr=true
            - --v=9
  • Manual adjustments

    You can manually adjust the number of CoreDNS pods by running the following command.

    kubectl scale --replicas={target} deployment/coredns -n kube-system # Replace {target} with the desired pod count
  • Do not use workload autoscaling

    Although workload autoscaling mechanisms such as horizontal pod autoscaling (HPA) or CronHPA can automatically adjust the number of pods, they trigger frequent scaling operations. Due to the resolution anomalies that can occur during a pod scale-in as described earlier, do not use workload autoscaling to control the number of CoreDNS pods.
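
The replica formula used by the dns-autoscaler above can be worked through with a few numbers. The following is a minimal sketch of that formula only. The function name is hypothetical, the defaults mirror the `--default-params` values shown in the Deployment, and the `preventSinglePointFailure` option is not modeled.

```python
import math


def desired_replicas(cores, nodes, cores_per_replica=64, nodes_per_replica=8,
                     min_replicas=2, max_replicas=100):
    """replicas = max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica)),
    then constrained to the range [min_replicas, max_replicas]."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    return max(min_replicas, min(max_replicas, max(by_cores, by_nodes)))


# Example cluster sizes (illustrative values, not measurements).
for cores, nodes in [(16, 4), (64, 20), (1024, 40), (12800, 100)]:
    print(f"{cores} cores, {nodes} nodes -> {desired_replicas(cores, nodes)} CoreDNS pods")
```

For example, a cluster with 20 nodes and 64 cores gets ceil(20/8) = 3 pods because the node term dominates, and a cluster with 4 nodes is raised to the minimum of 2 pods.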

Adjust CoreDNS pod specifications

Another way to adjust CoreDNS resources is to modify the pod specifications. In ACK managed Pro clusters, CoreDNS pods have a default memory limit of 2Gi and no CPU limit. We recommend that you set the CPU limit to 4096m, with a minimum of 1024m. You can adjust the CoreDNS pod configuration in the console.

Modify the CoreDNS configuration in the console

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Add-ons.

  3. Click the Network tab, find the CoreDNS card, and then click Configuration.

  4. Modify the CoreDNS settings and click OK.

    In the CoreDNS parameter configuration dialog box, set resource parameters such as MemoryRequest (for example, 100Mi), CpuRequest (for example, 100m), MemoryLimit (for example, 2Gi), and CpuLimit. You can also set NodeSelector node selection labels (for example, Key: kubernetes.io/os, Value: linux). Modifying these parameters regenerates the component template YAML, which may overwrite changes that are made using kubectl or other methods.

Schedule CoreDNS pods

Important

Incorrect scheduling configurations may prevent CoreDNS pods from being deployed, which can cause CoreDNS to fail. Before you proceed, make sure that you fully understand scheduling.

Deploy CoreDNS pods across different zones and cluster nodes to avoid single-node or single-zone failures. CoreDNS versions earlier than v1.8.4.3 use weak node anti-affinity by default. This may cause some or all pods to be deployed on the same node due to insufficient node resources. If this occurs, you can delete the pods to trigger rescheduling or upgrade to the latest component version. CoreDNS versions earlier than v1.8 are no longer maintained. We recommend that you upgrade them as soon as possible.

Avoid deploying CoreDNS on cluster nodes that have full CPU or memory utilization because this affects DNS QPS and response latency. When possible, use custom parameters to schedule CoreDNS on dedicated cluster nodes to ensure stable domain name resolution.

Deploy CoreDNS on dedicated nodes using custom parameters

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Nodes > Nodes.

  3. On the Nodes page, click Manage Labels and Taints.

  4. On the Manage Labels and Taints page, select the target nodes and click Add Label.

    Note

    Select more nodes than the number of CoreDNS replicas to avoid deploying multiple CoreDNS replicas on a single node.

  5. In the Add dialog box, set the following parameters and click OK.

    • Name: node-role-type

    • Value: coredns

  6. In the navigation pane on the left of the cluster management page, select Operations > Add-ons. Search for CoreDNS.

  7. On the CoreDNS card, click Configuration. In the Configuration dialog box, click + Add next to NodeSelector, set the following parameters, and then click OK.

    • Key: node-role-type

    • Value: coredns

    CoreDNS is rescheduled to the nodes that have the specified label.

Optimize CoreDNS configuration

Container Service for Kubernetes (ACK) provides only default CoreDNS configurations. You should review all parameters and optimize them to ensure that CoreDNS properly serves your application containers. The CoreDNS configuration is highly flexible. For more information, see DNS policy configuration and domain name resolution and the official CoreDNS documentation.

Default CoreDNS configurations that are deployed with older Kubernetes versions may have risks. You can check for and optimize these configurations in the following ways:

You can also use the scheduled inspection and fault diagnosis features in Container Intelligence Operations to check the CoreDNS configuration files. If Container Intelligence Operations reports CoreDNS ConfigMap configuration anomalies, review the items mentioned above.

Note

CoreDNS may consume extra memory when it reloads configurations. After you modify CoreDNS configuration items, monitor the pod status. If the pods experience memory shortages, promptly increase the memory limit in the CoreDNS Deployment. We recommend that you set the memory to 2 GB.

Disable affinity settings for kube-dns service

Affinity settings may cause significant load imbalances among CoreDNS replicas. You can disable them in one of the following ways:

Console method

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Network > Services.

  3. In the kube-system namespace, click Edit YAML to the right of the kube-dns service.

    • If the value of the sessionAffinity field is None, you can skip the following steps.

    • If the value of sessionAffinity is ClientIP, proceed with the following steps.

  4. Delete the sessionAffinity and sessionAffinityConfig fields and all their subkeys. Then, click Update.

    # Delete all the following content.
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
  5. Click Edit YAML again to the right of the kube-dns service and verify that the value of the sessionAffinity field is None. If the value of the field is None, the kube-dns service has been successfully updated.

Command-line method

  1. Run the following command to view the kube-dns service configuration.

    kubectl -n kube-system get svc kube-dns -o yaml
    • If the value of the sessionAffinity field is None, you can skip the following steps.

    • If the value of sessionAffinity is ClientIP, proceed with the following steps.

  2. Run the following command to open and edit the kube-dns service.

    kubectl -n kube-system edit service kube-dns
  3. Delete all sessionAffinity-related settings (sessionAffinity, sessionAffinityConfig, and all subkeys). Then, save the change and exit.

    # Delete all the following content.
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800
  4. After the modification, run the following command again to verify that the value of the sessionAffinity field is None. If it is None, the kube-dns service update is successful.

    kubectl -n kube-system get svc kube-dns -o yaml
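
If you check many clusters, the sessionAffinity verification can be scripted. The following sketch assumes that you export the Service as JSON (for example, with `kubectl -n kube-system get svc kube-dns -o json`) and pass the parsed object to a hypothetical helper. The sample manifest is abbreviated for illustration.

```python
import json


def has_session_affinity(service):
    """Return True if the Service still uses session affinity (for example, ClientIP)."""
    affinity = service.get("spec", {}).get("sessionAffinity", "None")
    return affinity != "None"


# Abbreviated sample of a kube-dns Service that still needs to be fixed.
sample = json.loads("""
{
  "kind": "Service",
  "metadata": {"name": "kube-dns", "namespace": "kube-system"},
  "spec": {
    "sessionAffinity": "ClientIP",
    "sessionAffinityConfig": {"clientIP": {"timeoutSeconds": 10800}}
  }
}
""")

if has_session_affinity(sample):
    print("kube-dns still uses session affinity; remove the sessionAffinity settings")
else:
    print("kube-dns session affinity is disabled")
```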

Disable the Autopath plugin

Some older versions of CoreDNS have the Autopath plugin enabled. This plugin may produce incorrect resolution results in extreme scenarios. You can verify whether the plugin is enabled and disable it by editing the configuration file. For more information, see Autopath.

Note

After you disable the Autopath plugin, the client-side DNS query QPS may increase by up to three times, and the single-domain resolution time may increase by up to three times. You should monitor the CoreDNS load and the impact on your business.

  1. Run the kubectl -n kube-system edit configmap coredns command to open the CoreDNS configuration file.

  2. Delete the autopath @kubernetes line and save the change.

  3. Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.

Configure graceful shutdown for CoreDNS

The lameduck mechanism in CoreDNS enables graceful shutdown. When CoreDNS stops or restarts, this mechanism ensures that ongoing requests are completed normally without being abruptly interrupted. The lameduck mechanism works as follows:

  • When CoreDNS terminates, it enters lameduck mode.

  • In lameduck mode, CoreDNS stops accepting new requests but continues to process existing requests until they are complete or until the lameduck timeout period expires.

Console method

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Configurations > ConfigMaps.

  3. In the kube-system namespace, click Edit YAML to the right of the coredns configuration item.

  4. Refer to the following CoreDNS configuration file. Make sure that the health plugin is enabled and set the lameduck timeout to 15s. Then, click OK.

    .:53 {
        errors
        # The health plugin may have different default settings in various CoreDNS versions.
        # Case 1: health plugin disabled by default.
        # Case 2: health plugin enabled by default but lameduck time not set.
        # health
        # Case 3: health plugin enabled by default with lameduck time set to 5s.
        # health {
        #     lameduck 5s
        # }
        # For all three cases, modify uniformly as follows to set lameduck to 15s.
        health {
            lameduck 15s
        }
        # Do not modify other plugins; omitted here.
    }

If the CoreDNS pods run as normal, the graceful shutdown configuration is successfully updated. If the pods become abnormal, check the pod events and logs to identify the cause.

Command-line method

  1. Run the following command to open the CoreDNS configuration file.

    kubectl -n kube-system edit configmap/coredns
  2. Refer to the following Corefile. Make sure that the health plugin is enabled and set the lameduck parameter to 15s.

    .:53 {
        errors
        # The health plugin may have different default settings in various CoreDNS versions.
        # Case 1: health plugin disabled by default.
        # Case 2: health plugin enabled by default but lameduck time not set.
        # health
        # Case 3: health plugin enabled by default with lameduck time set to 5s.
        # health {
        #     lameduck 5s
        # }
        # For all three cases, modify uniformly as follows to set lameduck to 15s.
        health {
            lameduck 15s
        }
        # Do not modify other plugins; omitted here.
    }
  3. Save the change and exit after you modify the CoreDNS configuration file.

  4. If CoreDNS runs as normal, the graceful shutdown configuration is successfully updated. If the pods become abnormal, check the pod events and logs to identify the cause.

Set the default protocol for the Forward plugin to communicate with upstream VPC DNS servers

NodeLocal DNSCache communicates with CoreDNS using TCP. CoreDNS uses the same protocol as the incoming request when it communicates with upstream DNS servers. By default, external domain resolution requests from application containers pass through NodeLocal DNSCache and CoreDNS, and then reach the VPC DNS servers (100.100.2.136 and 100.100.2.138) using TCP.

VPC DNS servers have limited support for TCP. If you use NodeLocal DNSCache, you should modify the CoreDNS configuration to always prefer UDP when communicating with upstream DNS servers to avoid resolution anomalies. You can modify the ConfigMap named coredns in the kube-system namespace as follows. For more information, see Manage ConfigMaps. In the forward plugin, specify prefer_udp as the upstream protocol. After this modification, CoreDNS prefers to use UDP for upstream communication:

# Before modification
forward . /etc/resolv.conf
# After modification
forward . /etc/resolv.conf {
  prefer_udp
}

Configure the Ready readiness probe plugin

CoreDNS versions later than 1.5.0 require the ready plugin to enable readiness probes.

  1. Run the following command to open the CoreDNS configuration file.

    kubectl -n kube-system edit configmap/coredns
  2. Check for the ready line. If it is missing, add ready. Press Esc, type :wq!, and then press Enter to save the change and exit.

    apiVersion: v1
    data:
      Corefile: |
        .:53 {
            errors
            health {
                lameduck 15s
            }
            ready # Add this line if it is missing. Keep its indentation consistent with the kubernetes line.
            kubernetes cluster.local in-addr.arpa ip6.arpa {
                pods verified
                fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf {
                max_concurrent 1000
                prefer_udp
            }
            cache 30
            loop
            log
            reload
            loadbalance
        }
  3. Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.
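
Several of the checks in this section (Autopath disabled, lameduck set to 15s, prefer_udp in the forward plugin, and the ready plugin present) can be combined into a rough script. The following is a minimal sketch that scans Corefile text line by line. It is not a real Corefile parser, the function name `lint_corefile` and the finding messages are hypothetical, and it assumes a single server block.

```python
import re


def lint_corefile(corefile):
    """Return a list of findings for the recommendations in this section."""
    # Strip comments and blank lines so that only directives remain.
    lines = [ln.split("#", 1)[0].strip() for ln in corefile.splitlines()]
    lines = [ln for ln in lines if ln]
    text = "\n".join(lines)
    plugins = {ln.split()[0] for ln in lines}

    findings = []
    if "autopath" in plugins:
        findings.append("autopath is enabled")
    if "ready" not in plugins:
        findings.append("ready plugin is missing")
    lameduck = re.search(r"^lameduck\s+(\d+)s$", text, re.M)
    if lameduck is None or int(lameduck.group(1)) < 15:
        findings.append("health lameduck is missing or shorter than 15s")
    forward = re.search(r"^forward\b[^\n{]*\{([^}]*)\}", text, re.M)
    if forward is None or "prefer_udp" not in forward.group(1).split():
        findings.append("forward does not set prefer_udp")
    return findings


OLD_COREFILE = """
.:53 {
    errors
    health
    autopath @kubernetes
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods verified
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}
"""

for finding in lint_corefile(OLD_COREFILE):
    print("WARNING:", finding)
```

You could feed the function the Corefile value from the coredns ConfigMap in the kube-system namespace. An empty result means that none of these four issues was detected, not that the configuration is otherwise correct.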

Configure the multisocket plugin to enhance CoreDNS resolution performance

CoreDNS v1.12.1 introduced the multisocket plugin. If you enable this plugin, CoreDNS can use multiple sockets to listen on the same port, which enhances performance in high-CPU scenarios. For more information about the plugin, see the community documentation.

You can enable multisocket in the coredns ConfigMap:

.:53 {
        ...
        prometheus :9153
        multisocket [NUM_SOCKETS]
        forward . /etc/resolv.conf
        ...
}

NUM_SOCKETS specifies the number of sockets that listen on the same port.

Configuration recommendations: We recommend that you align the value of NUM_SOCKETS with the estimated CPU usage, CPU resource limits, and available resources in the cluster. For example:

  • If CoreDNS consumes 4 cores at its peak and 8 cores are available, set NUM_SOCKETS to 2.

  • If CoreDNS consumes 8 cores at its peak and 64 cores are available, set NUM_SOCKETS to 8.

To determine the optimal configuration, you can test different settings and measure the QPS and load.

If you do not specify NUM_SOCKETS, CoreDNS uses GOMAXPROCS by default. The value of `GOMAXPROCS` is equal to the CPU limit of the CoreDNS pod. If no CPU limit is set, the value is equal to the number of CPU cores on the node.
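
The default described above can be sketched as follows. The function name is hypothetical, and the handling of fractional CPU limits (rounding down, with a minimum of 1) is an assumption made for illustration because the text does not specify it.

```python
def default_num_sockets(node_cores, cpu_limit_millicores=None):
    """Estimate the socket count used when NUM_SOCKETS is omitted (GOMAXPROCS)."""
    if cpu_limit_millicores is None:
        # No CPU limit on the pod: GOMAXPROCS follows the node's core count.
        return node_cores
    # Assumption: fractional limits are rounded down, but never below 1.
    return max(1, cpu_limit_millicores // 1000)


print(default_num_sockets(node_cores=16))                             # no CPU limit
print(default_num_sockets(node_cores=16, cpu_limit_millicores=4096))  # 4096m limit
```

Under these assumptions, a pod with the recommended 4096m CPU limit would listen with 4 sockets, whereas a pod without a limit on a 16-core node would use 16. Set NUM_SOCKETS explicitly if that default does not match your load tests.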